KINASES AND PHOSPHATASES

TECHNICAL FIELD 

The invention relates to novel nucleic acids, kinases and phosphatases encoded by these
nucleic acids, and to the use of these nucleic acids and proteins in the diagnosis, treatment, and
prevention of cardiovascular diseases, immune system disorders, neurological disorders, disorders
affecting growth and development, lipid disorders, cell proliferative disorders, and cancers. The
invention also relates to the assessment of the effects of exogenous compounds on the expression of
nucleic acids and kinases and phosphatases.
BACKGROUND OF THE INVENTION

Reversible protein phosphorylation is the ubiquitous strategy used to control many of the
intracellular events in eukaryotic cells. It is estimated that more than ten percent of proteins active in
a typical mammalian cell are phosphorylated. Kinases catalyze the transfer of high-energy phosphate
groups from adenosine triphosphate (ATP) to target proteins on the hydroxyamino acid residues
serine, threonine, or tyrosine. Phosphatases, in contrast, remove these phosphate groups.
Extracellular signals including hormones, neurotransmitters, and growth and differentiation factors
can activate kinases, which can occur as cell surface receptors or as the activator of the final effector
protein, as well as other locations along the signal transduction pathway. Cascades of kinases occur,
as well as kinases sensitive to second messenger molecules. This system allows for the amplification
of weak signals (low abundance growth factor molecules, for example), as well as the synthesis of
many weak signals into an all-or-nothing response. Phosphatases, then, are essential in determining
the extent of phosphorylation in the cell and, together with kinases, regulate key cellular processes
such as metabolic enzyme activity, proliferation, cell growth and differentiation, cell adhesion, and
cell cycle progression.
KINASES 

Kinases comprise the largest known enzyme superfamily and vary widely in their target
molecules. Kinases catalyze the transfer of high energy phosphate groups from a phosphate donor to
a phosphate acceptor. Nucleotides usually serve as the phosphate donor in these reactions, with most
kinases utilizing adenosine triphosphate (ATP). The phosphate acceptor can be any of a variety of
molecules, including nucleosides, nucleotides, lipids, carbohydrates, and proteins. Proteins are
phosphorylated on hydroxyamino acids. Addition of a phosphate group alters the local charge on the
acceptor molecule, causing internal conformational changes and potentially influencing
intermolecular contacts. Reversible protein phosphorylation is the primary method for regulating
protein activity in eukaryotic cells. In general, proteins are activated by phosphorylation in response
to extracellular signals such as hormones, neurotransmitters, and growth and differentiation factors.
The activated proteins initiate the cell's intracellular response by way of intracellular signaling
pathways and second messenger molecules such as cyclic nucleotides, calcium-calmodulin, inositol,
and various mitogens, that regulate protein phosphorylation.

Kinases are involved in all aspects of a cell's function, from basic metabolic processes, such
as glycolysis, to cell-cycle regulation, differentiation, and communication with the extracellular
environment through signal transduction cascades. Inappropriate phosphorylation of proteins in cells
has been linked to changes in cell cycle progression and cell differentiation. Changes in the cell cycle
have been linked to induction of apoptosis or cancer. Changes in cell differentiation have been linked
to diseases and disorders of the reproductive system, immune system, and skeletal muscle.

There are two classes of protein kinases. One class, protein tyrosine kinases (PTKs),
phosphorylates tyrosine residues, and the other class, protein serine/threonine kinases (STKs),
phosphorylates serine and threonine residues. Some PTKs and STKs possess structural
characteristics of both families and have dual specificity for both tyrosine and serine/threonine
residues. Almost all kinases contain a conserved 250-300 amino acid catalytic domain containing
specific residues and sequence motifs characteristic of the kinase family. The protein kinase catalytic
domain can be further divided into 11 subdomains. N-terminal subdomains I-IV fold into a two-lobed
structure which binds and orients the ATP donor molecule, and subdomain V spans the two lobes.
C-terminal subdomains VI-XI bind the protein substrate and transfer the gamma phosphate from ATP to
the hydroxyl group of a tyrosine, serine, or threonine residue. Each of the 11 subdomains contains
specific catalytic residues or amino acid motifs characteristic of that subdomain. For example,
subdomain I contains an 8-amino acid glycine-rich ATP binding consensus motif, subdomain II
contains a critical lysine residue required for maximal catalytic activity, and subdomains VI through
IX comprise the highly conserved catalytic core. PTKs and STKs also contain distinct sequence
motifs in subdomains VI and VIII which may confer hydroxyamino acid specificity.

In addition, kinases may also be classified by additional amino acid sequences, generally
between 5 and 100 residues, which either flank or occur within the kinase domain. These additional
amino acid sequences regulate kinase activity and determine substrate specificity. (Reviewed in
Hardie, G. and S. Hanks (1995) The Protein Kinase Facts Book, Vol I, Academic Press,
San Diego CA.). In particular, two protein kinase signature sequences have been identified in the
kinase domain, the first containing an active site lysine residue involved in ATP binding, and the
second containing an aspartate residue important for catalytic activity. If a protein analyzed includes
the two protein kinase signatures, the probability of that protein being a protein kinase is close to
100% (PROSITE: PDOC00100, November 1995).
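
The signature-based screening described above can be illustrated with a minimal sketch. The two
regular expressions below are simplified stand-ins chosen for illustration (a glycine-rich
G-x-G-x-x-G loop and an H/Y-R-D catalytic-loop motif); they are assumptions, not the PROSITE
signature patterns themselves. The function names and the toy sequence are hypothetical as well.

```python
import re

# Simplified, illustrative motifs (assumptions, NOT the PROSITE patterns):
# a glycine-rich loop G-x-G-x-x-G and an H/Y-R-D catalytic-loop motif.
GLY_RICH_LOOP = re.compile(r"G.G..G")
CATALYTIC_LOOP = re.compile(r"[HY]RD")


def find_kinase_like_motifs(sequence: str) -> dict:
    """Return the 0-based start of each simplified motif, or None if absent."""
    seq = sequence.upper()
    gly = GLY_RICH_LOOP.search(seq)
    # Look for the catalytic-loop motif only downstream of the glycine-rich loop.
    cat = CATALYTIC_LOOP.search(seq, gly.end()) if gly else None
    return {
        "gly_rich_loop": gly.start() if gly else None,
        "catalytic_loop": cat.start() if cat else None,
    }


def looks_like_kinase(sequence: str) -> bool:
    """True only when both simplified motifs are found in the expected order."""
    hits = find_kinase_like_motifs(sequence)
    return all(position is not None for position in hits.values())


# Hypothetical toy fragment, not a real protein sequence.
TOY_SEQUENCE = "MAAALGTGSFGRVMLVKHKAAAAYRDLKPENLLI"

if __name__ == "__main__":
    print(find_kinase_like_motifs(TOY_SEQUENCE))
    print(looks_like_kinase(TOY_SEQUENCE))
```

Requiring both motifs, in order, mirrors the statement that a protein carrying both signatures is
very likely a protein kinase; a real screen would use the curated signature patterns instead.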
Protein Tyrosine Kinase

Protein tyrosine kinases (PTKs) may be classified as either transmembrane, receptor PTKs or
nontransmembrane, nonreceptor PTK proteins. Transmembrane tyrosine kinases function as
receptors for most growth factors. Growth factors bind to the receptor tyrosine kinase (RTK), which
causes the receptor to phosphorylate itself (autophosphorylation) and specific intracellular second
messenger proteins. Growth factors (GF) that associate with receptor PTKs include epidermal GF,
platelet-derived GF, fibroblast GF, hepatocyte GF, insulin and insulin-like GFs, nerve GF, vascular
endothelial GF, and macrophage colony stimulating factor.

Nontransmembrane, nonreceptor PTKs lack transmembrane regions and, instead, form
signaling complexes with the cytosolic domains of plasma membrane receptors. Receptors that
function through non-receptor PTKs include those for cytokines and hormones (growth hormone and
prolactin), and antigen-specific receptors on T and B lymphocytes.

Many PTKs were first identified as oncogene products in cancer cells in which PTK
activation was no longer subject to normal cellular controls. In fact, about one third of the known
oncogenes encode PTKs. Furthermore, cellular transformation (oncogenesis) is often accompanied
by increased tyrosine phosphorylation activity (Charbonneau, H. and N.K. Tonks (1992) Annu. Rev.
Cell Biol. 8:463-493). Regulation of PTK activity may therefore be an important strategy in
controlling some types of cancer.
Protein Serine/Threonine Kinases 

Protein serine/threonine kinases (STKs) are nontransmembrane proteins. A subclass of STKs
are known as ERKs (extracellular signal regulated kinases) or MAPs (mitogen-activated protein
kinases) and are activated after cell stimulation by a variety of hormones and growth factors. Cell
stimulation induces a signaling cascade leading to phosphorylation of MEK (MAP/ERK kinase)
which, in turn, activates ERK via serine and threonine phosphorylation. A varied number of proteins
represent the downstream effectors for the active ERK and implicate it in the control of cell
proliferation and differentiation, as well as regulation of the cytoskeleton. Activation of ERK is
normally transient, and cells possess dual specificity phosphatases that are responsible for its
down-regulation. Also, numerous studies have shown that elevated ERK activity is associated with some
cancers. Other STKs include the second messenger dependent protein kinases such as the
cyclic-AMP dependent protein kinases (PKA), calcium-calmodulin (CaM) dependent protein kinases,
and the mitogen-activated protein kinases (MAP); the cyclin-dependent protein kinases; checkpoint
and cell cycle kinases; Numb-associated kinase (Nak); human Fused (hFu); proliferation-related
kinases; 5'-AMP-activated protein kinases; and kinases involved in apoptosis.

One member of the ERK family of MAP kinases, ERK7, is a novel 61-kDa protein that has
motif similarities to ERK1 and ERK2, but is not activated by extracellular stimuli as are ERK1 and
ERK2 nor by the common activators, c-Jun N-terminal kinase (JNK) and p38 kinase. ERK7 regulates
its nuclear localization and inhibition of growth through its C-terminal tail, not through the kinase
domain as is typical with other MAP kinases (Abe, M.K. (1999) Mol. Cell. Biol. 19:1301-1312).

The second messenger dependent protein kinases primarily mediate the effects of second
messengers such as cyclic AMP (cAMP), cyclic GMP, inositol triphosphate, phosphatidylinositol
3,4,5-triphosphate, cyclic ADP ribose, arachidonic acid, diacylglycerol and calcium-calmodulin. The
PKAs are involved in mediating hormone-induced cellular responses and are activated by cAMP
produced within the cell in response to hormone stimulation. cAMP is an intracellular mediator of
hormone action in all animal cells that have been studied. Hormone-induced cellular responses
include thyroid hormone secretion, cortisol secretion, progesterone secretion, glycogen breakdown,
bone resorption, and regulation of heart rate and force of heart muscle contraction. PKA is found in
all animal cells and is thought to account for the effects of cAMP in most of these cells. Altered PKA
expression is implicated in a variety of disorders and diseases including cancer, thyroid disorders,
diabetes, atherosclerosis, and cardiovascular disease (Isselbacher, K.J. et al. (1994) Harrison's
Principles of Internal Medicine, McGraw-Hill, New York NY, pp. 416-431, 1887).

The casein kinase I (CKI) gene family is another subfamily of serine/threonine protein
kinases. This continuously expanding group of kinases has been implicated in the regulation of
numerous cytoplasmic and nuclear processes, including cell metabolism and DNA replication and
repair. CKI enzymes are present in the membranes, nucleus, cytoplasm and cytoskeleton of
eukaryotic cells, and on the mitotic spindles of mammalian cells (Fish, K.J. et al. (1995) J. Biol.
Chem. 270:14875-14883).

The CKI family members all have a short amino-terminal domain of 9-76 amino acids, a
highly conserved kinase domain of 284 amino acids, and a variable carboxyl-terminal domain that
ranges from 24 to over 200 amino acids in length (Cegielska, A. et al. (1998) J. Biol. Chem.
273:1357-1364). The CKI family is comprised of highly related proteins, as seen by the identification
of isoforms of casein kinase I from a variety of sources. There are at least five mammalian isoforms,
α, β, γ, δ, and ε. Fish et al. identified CKI-epsilon from a human placenta cDNA library. It is a basic
protein of 416 amino acids and is closest to CKI-delta. Through recombinant expression, it was
determined to phosphorylate known CKI substrates and was inhibited by the CKI-specific inhibitor
CKI-7. The human gene for CKI-epsilon was able to rescue yeast with a slow-growth phenotype
caused by deletion of the yeast CKI locus, HRR250 (Fish et al., supra).

The mammalian circadian mutation tau was found to be a semidominant autosomal allele of
CKI-epsilon that markedly shortens period length of circadian rhythms in Syrian hamsters. The tau
locus is encoded by casein kinase I-epsilon, which is also a homolog of the Drosophila circadian gene
double-time. Studies of both the wildtype and tau mutant CKI-epsilon enzyme indicated that the
mutant enzyme has a noticeable reduction in the maximum velocity and autophosphorylation state.
Further, in vitro, CKI-epsilon is able to interact with mammalian PERIOD proteins, while the mutant
enzyme is deficient in its ability to phosphorylate PERIOD. Lowrey et al. have proposed that
CKI-epsilon plays a major role in delaying the negative feedback signal within the
transcription-translation-based autoregulatory loop that composes the core of the circadian mechanism.
Therefore the CKI-epsilon enzyme is an ideal target for pharmaceutical compounds influencing circadian
rhythms, jet-lag and sleep, in addition to other physiologic and metabolic processes under circadian
regulation (Lowrey, P.L. et al. (2000) Science 288:483-491).

Homeodomain-interacting protein kinases (HIPKs) are serine/threonine kinases and novel
members of the DYRK kinase subfamily (Hofmann, T.G. et al. (2000) Biochimie 82:1123-1127).
HIPKs contain a conserved protein kinase domain separated from a domain that interacts with
homeoproteins. HIPKs are nuclear kinases, and HIPK2 is highly expressed in neuronal tissue (Kim,
Y.H. et al. (1998) J. Biol. Chem. 273:25875-25879; Wang, Y. et al. (2001) Biochim. Biophys. Acta
1518:168-172). HIPKs act as corepressors for homeodomain transcription factors. This corepressor
activity is seen in posttranslational modifications such as ubiquitination and phosphorylation, each of
which is important in the regulation of cellular protein function (Kim, Y.H. et al. (1999) Proc. Natl.
Acad. Sci. USA 96:12350-12355).

The human h-warts protein, a homolog of Drosophila warts tumor suppressor gene, maps to
chromosome 6q24-25.1. It has a serine/threonine kinase domain and is localized to centrosomes in
interphase cells. It is involved in mitosis and functions as a component of the mitotic apparatus
(Nishiyama, Y. et al. (1999) FEBS Lett. 459:159-165).
Calcium-Calmodulin Dependent Protein Kinases

Calcium-calmodulin dependent (CaM) kinases are involved in regulation of smooth muscle
contraction, glycogen breakdown (phosphorylase kinase), and neurotransmission (CaM kinase I and
CaM kinase II). CaM dependent protein kinases are activated by calmodulin, an intracellular calcium
receptor, in response to the concentration of free calcium in the cell. Many CaM kinases are also
activated by phosphorylation. Some CaM kinases are also activated by autophosphorylation or by
other regulatory kinases. CaM kinase I phosphorylates a variety of substrates including the
neurotransmitter-related proteins synapsin I and II, the gene transcription regulator, CREB, and the
cystic fibrosis conductance regulator protein, CFTR (Haribabu, B. et al. (1995) EMBO J.
14:3679-3686). CaM kinase II also phosphorylates synapsin at different sites and controls the synthesis of
catecholamines in the brain through phosphorylation and activation of tyrosine hydroxylase. CaM
kinase II controls the synthesis of catecholamines and serotonin, through phosphorylation/activation
of tyrosine hydroxylase and tryptophan hydroxylase, respectively (Fujisawa, H. (1990) BioEssays
12:27-29). The mRNA encoding a calmodulin-binding protein kinase-like protein was found to be
enriched in mammalian forebrain. This protein is associated with vesicles in both axons and
dendrites and accumulates largely postnatally. The amino acid sequence of this protein is similar to
CaM-dependent STKs, and the protein binds calmodulin in the presence of calcium (Godbout, M. et
al. (1994) J. Neurosci. 14:1-13).

Mitogen-Activated Protein Kinases

The mitogen-activated protein kinases (MAP), which mediate signal transduction from the
cell surface to the nucleus via phosphorylation cascades, are another STK family that regulates
intracellular signaling pathways. Several subgroups have been identified, and each manifests
different substrate specificities and responds to distinct extracellular stimuli (Egan, S.E. and R.A.
Weinberg (1993) Nature 365:781-783). There are three kinase modules comprising the MAP kinase
cascade: MAPK (MAP), MAPK kinase (MAP2K, MAPKK, or MKK), and MKK kinase (MAP3K,
MAPKKK, or MEKK) (Wang, X.S. et al. (1998) Biochem. Biophys. Res. Commun. 253:33-37). The
extracellular-regulated kinase (ERK) pathway is activated by growth factors and mitogens, for
example, epidermal growth factor (EGF), ultraviolet light, hyperosmolar medium, heat shock, or
endotoxic lipopolysaccharide (LPS). The closely related though distinct parallel pathways, the c-Jun
N-terminal kinase (JNK), or stress-activated kinase (SAPK) pathway, and the p38 kinase pathway are
activated by stress stimuli and proinflammatory cytokines such as tumor necrosis factor (TNF) and
interleukin-1 (IL-1). Altered MAP kinase expression is implicated in a variety of disease conditions
including cancer, inflammation, immune disorders, and disorders affecting growth and development.
MAP kinase signaling pathways are present in mammalian cells as well as in yeast.
Cyclin-Dependent Protein Kinases

The cyclin-dependent protein kinases (CDKs) are STKs that control the progression of cells
through the cell cycle. The entry and exit of a cell from mitosis are regulated by the synthesis and
destruction of a family of activating proteins called cyclins. Cyclins are small regulatory proteins that
bind to and activate CDKs, which then phosphorylate and activate selected proteins involved in the
mitotic process. CDKs are unique in that they require multiple inputs to become activated. In
addition to cyclin binding, CDK activation requires the phosphorylation of a specific threonine
residue and the dephosphorylation of a specific tyrosine residue on the CDK.
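
The multiple-input requirement for CDK activation can be written as a simple conjunction. The
sketch below is only a logical illustration of the three conditions named above; the class and
field names are hypothetical, and no kinetics or site identities are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdkState:
    """Hypothetical, purely logical description of a CDK molecule."""
    cyclin_bound: bool
    activating_thr_phosphorylated: bool
    inhibitory_tyr_phosphorylated: bool


def is_active(state: CdkState) -> bool:
    # All three inputs must agree: cyclin bound, the specific threonine
    # phosphorylated, and the specific tyrosine dephosphorylated.
    return (
        state.cyclin_bound
        and state.activating_thr_phosphorylated
        and not state.inhibitory_tyr_phosphorylated
    )


if __name__ == "__main__":
    fully_primed = CdkState(True, True, False)
    tyr_still_phosphorylated = CdkState(True, True, True)
    print(is_active(fully_primed))
    print(is_active(tyr_still_phosphorylated))
```

Under this reading, a cyclin-bound CDK that still carries the tyrosine phosphate remains inactive
until that phosphate is removed.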
Another family of STKs associated with the cell cycle are the NIMA (never in mitosis)-related
kinases (Neks). Both CDKs and Neks are involved in duplication, maturation, and separation
of the microtubule organizing center, the centrosome, in animal cells (Fry, A.M. et al. (1998) EMBO
J. 17:470-481).

Checkpoint and Cell Cycle Kinases

In the process of cell division, the order and timing of cell cycle transitions are under control
of cell cycle checkpoints, which ensure that critical events such as DNA replication and chromosome
segregation are carried out with precision. If DNA is damaged, e.g. by radiation, a checkpoint
pathway is activated that arrests the cell cycle to provide time for repair. If the damage is extensive,
apoptosis is induced. In the absence of such checkpoints, the damaged DNA is inherited by aberrant
cells which may cause proliferative disorders such as cancer. Protein kinases play an important role
in this process. For example, a specific kinase, checkpoint kinase 1 (Chk1), has been identified in
yeast and mammals, and is activated by DNA damage in yeast. Activation of Chk1 leads to the arrest
of the cell at the G2/M transition (Sanchez, Y. et al. (1997) Science 277:1497-1501). Specifically,
Chk1 phosphorylates the cell division cycle phosphatase CDC25, inhibiting its normal function which
is to dephosphorylate and activate the cyclin-dependent kinase Cdc2. Cdc2 activation controls the
entry of cells into mitosis (Peng, C.-Y. et al. (1997) Science 277:1501-1505). Thus, activation of
Chk1 prevents the damaged cell from entering mitosis. A deficiency in a checkpoint kinase, such as
Chk1, may also contribute to cancer by failure to arrest cells with damaged DNA at other checkpoints
such as G2/M.

Proliferation-Related Kinases

Proliferation-related kinase is a serum/cytokine inducible STK that is involved in regulation
of the cell cycle and cell proliferation in human megakaryocytic cells (Li, B. et al. (1996) J. Biol.
Chem. 271:19402-19408). Proliferation-related kinase is related to the polo (derived from
Drosophila polo gene) family of STKs implicated in cell division. Proliferation-related kinase is
downregulated in lung tumor tissue and may be a proto-oncogene whose deregulated expression in
normal tissue leads to oncogenic transformation.
5'-AMP-activated protein kinase

A ligand-activated STK protein kinase is 5'-AMP-activated protein kinase (AMPK) (Gao, G.
et al. (1996) J. Biol. Chem. 271:8675-8681). Mammalian AMPK is a regulator of fatty acid and sterol
synthesis through phosphorylation of the enzymes acetyl-CoA carboxylase and
hydroxymethylglutaryl-CoA reductase and mediates responses of these pathways to cellular stresses
such as heat shock and depletion of glucose and ATP. AMPK is a heterotrimeric complex comprised
of a catalytic alpha subunit and two non-catalytic beta and gamma subunits that are believed to
regulate the activity of the alpha subunit. Subunits of AMPK have a much wider distribution in
non-lipogenic tissues such as brain, heart, spleen, and lung than expected. This distribution suggests
that its role may extend beyond regulation of lipid metabolism alone.
Kinases in Apoptosis

Apoptosis is a highly regulated signaling pathway leading to cell death that plays a crucial
role in tissue development and homeostasis. Deregulation of this process is associated with the
pathogenesis of a number of diseases including autoimmune diseases, neurodegenerative disorders,
and cancer. Various STKs play key roles in this process. ZIP kinase is an STK containing a
C-terminal leucine zipper domain in addition to its N-terminal protein kinase domain. This
C-terminal domain appears to mediate homodimerization and activation of the kinase as well as
interactions with transcription factors such as activating transcription factor, ATF4, a member of the
cyclic-AMP responsive element binding protein (ATF/CREB) family of transcriptional factors
(Sanjo, H. et al. (1998) J. Biol. Chem. 273:29066-29071). DRAK1 and DRAK2 are STKs that share
homology with the death-associated protein kinases (DAP kinases), known to function in
interferon-γ induced apoptosis (Sanjo et al., supra). Like ZIP kinase, DAP kinases contain a C-terminal
protein-protein interaction domain, in the form of ankyrin repeats, in addition to the N-terminal
kinase domain. ZIP, DAP, and DRAK kinases induce morphological changes associated with
apoptosis when transfected into NIH3T3 cells (Sanjo et al., supra). However, deletion of either the
N-terminal kinase catalytic domain or the C-terminal domain of these proteins abolishes apoptosis
activity, indicating that in addition to the kinase activity, activity in the C-terminal domain is also
necessary for apoptosis, possibly as an interacting domain with a regulator or a specific substrate.

RICK is another STK recently identified as mediating a specific apoptotic pathway involving
the death receptor, CD95 (Inohara, N. et al. (1998) J. Biol. Chem. 273:12296-12300). CD95 is a
member of the tumor necrosis factor receptor superfamily and plays a critical role in the regulation
and homeostasis of the immune system (Nagata, S. (1997) Cell 88:355-365). The CD95 receptor
signaling pathway involves recruitment of various intracellular molecules to a receptor complex
following ligand binding. This process includes recruitment of the cysteine protease caspase-8
which, in turn, activates a caspase cascade leading to cell death. RICK is composed of an N-terminal
kinase catalytic domain and a C-terminal "caspase-recruitment" domain that interacts with
caspase-like domains, indicating that RICK plays a role in the recruitment of caspase-8. This
interpretation is supported by the fact that the expression of RICK in human 293T cells promotes
activation of caspase-8 and potentiates the induction of apoptosis by various proteins involved in the
CD95 apoptosis pathway (Inohara et al., supra).
Mitochondrial Protein Kinases 

A novel class of eukaryotic kinases, related by sequence to prokaryotic histidine protein
kinases, are the mitochondrial protein kinases (MPKs) which seem to have no sequence similarity
with other eukaryotic protein kinases. These protein kinases are located exclusively in the
mitochondrial matrix space and may have evolved from genes originally present in
respiration-dependent bacteria which were endocytosed by primitive eukaryotic cells. MPKs are responsible for
phosphorylation and inactivation of the branched-chain alpha-ketoacid dehydrogenase and pyruvate
dehydrogenase complexes (Harris, R.A. et al. (1995) Adv. Enzyme Regul. 34:147-162). Five MPKs
have been identified. Four members correspond to pyruvate dehydrogenase kinase isozymes,
regulating the activity of the pyruvate dehydrogenase complex, which is an important regulatory
enzyme at the interface between glycolysis and the citric acid cycle. The fifth member corresponds to
a branched-chain alpha-ketoacid dehydrogenase kinase, important in the regulation of the pathway for
the disposal of branched-chain amino acids (Harris, R.A. et al. (1997) Adv. Enzyme Regul.
37:271-293). Both starvation and the diabetic state are known to result in a great increase in the activity of
the pyruvate dehydrogenase kinase in the liver, heart and muscle of the rat. This increase contributes
in both disease states to the phosphorylation and inactivation of the pyruvate dehydrogenase complex
and conservation of pyruvate and lactate for gluconeogenesis (Harris (1995) supra).
KINASES WITH NON-PROTEIN SUBSTRATES 
Lipid and Inositol kinases 

Lipid kinases phosphorylate hydroxyl residues on lipid head groups. A family of kinases
involved in phosphorylation of phosphatidylinositol (PI) has been described, each member
phosphorylating a specific carbon on the inositol ring (Leevers, S.J. et al. (1999) Curr. Opin. Cell.
Biol. 11:219-225). The phosphorylation of phosphatidylinositol is involved in activation of the
protein kinase C signaling pathway. The inositol phospholipids (phosphoinositides) intracellular
signaling pathway begins with binding of a signaling molecule to a G-protein linked receptor in the
plasma membrane. This leads to the phosphorylation of phosphatidylinositol (PI) residues on the
inner side of the plasma membrane by inositol kinases, thus converting PI residues to the biphosphate
state (PIP2). PIP2 is then cleaved into inositol triphosphate (IP3) and diacylglycerol. These two
products act as mediators for separate signaling pathways. Cellular responses that are mediated by
these pathways are glycogen breakdown in the liver in response to vasopressin, smooth muscle
contraction in response to acetylcholine, and thrombin-induced platelet aggregation.

PI 3-kinase (PI3K), which phosphorylates the D3 position of PI and its derivatives, has a
central role in growth factor signal cascades involved in cell growth, differentiation, and metabolism.
PI3K is a heterodimer consisting of an adapter subunit and a catalytic subunit. The adapter subunit
acts as a scaffolding protein, interacting with specific tyrosine-phosphorylated proteins, lipid
moieties, and other cytosolic factors. When the adapter subunit binds tyrosine phosphorylated
targets, such as the insulin responsive substrate (IRS)-1, the catalytic subunit is activated and converts
PI (4,5) bisphosphate (PIP2) to PI (3,4,5) P3 (PIP3). PIP3 then activates a number of other proteins,
including PKA, protein kinase B (PKB), protein kinase C (PKC), glycogen synthase kinase (GSK)-3,
and p70 ribosomal s6 kinase. PI3K also interacts directly with the cytoskeletal organizing proteins,
Rac, rho, and cdc42 (Shepherd, P.R. et al. (1998) Biochem. J. 333:471-490). Animal models for
diabetes, such as obese and fat mice, have altered PI3K adapter subunit levels. Specific mutations in
the adapter subunit have also been found in an insulin-resistant Danish population, suggesting a role
for PI3K in type-2 diabetes (Shepherd, supra).

An example of lipid kinase phosphorylation activity is the phosphorylation of
D-erythro-sphingosine to the sphingolipid metabolite, sphingosine-1-phosphate (SPP). SPP has
emerged as a novel lipid second-messenger with both extracellular and intracellular actions (Kohama,
T. et al. (1998) J. Biol. Chem. 273:23722-23728). Extracellularly, SPP is a ligand for the G-protein
coupled receptor EDG-1 (endothelial-derived, G-protein coupled receptor). Intracellularly, SPP
regulates cell growth, survival, motility, and cytoskeletal changes. SPP levels are regulated by
sphingosine kinases that specifically phosphorylate D-erythro-sphingosine to SPP. The importance of
sphingosine kinase in cell signaling is indicated by the fact that various stimuli, including
platelet-derived growth factor (PDGF), nerve growth factor, and activation of protein kinase C,
increase cellular levels of SPP by activation of sphingosine kinase, and the fact that competitive
inhibitors of the enzyme selectively inhibit cell proliferation induced by PDGF (Kohama et al.,
supra).

Purine Nucleotide Kinases

The purine nucleotide kinases, adenylate kinase (ATP:AMP phosphotransferase, or AdK) and
guanylate kinase (ATP:GMP phosphotransferase, or GuK) play a key role in nucleotide metabolism
and are crucial to the synthesis and regulation of cellular levels of ATP and GTP, respectively. These
two molecules are precursors in DNA and RNA synthesis in growing cells and provide the primary
source of biochemical energy in cells (ATP), and signal transduction pathways (GTP). Inhibition of
various steps in the synthesis of these two molecules has been the basis of many antiproliferative
drugs for cancer and antiviral therapy (Pillwein, K. et al. (1990) Cancer Res. 50:1576-1579).

AdK is found in almost all cell types and is especially abundant in cells having high rates of
ATP synthesis and utilization such as skeletal muscle. In these cells AdK is physically associated
with mitochondria and myofibrils, the subcellular structures that are involved in energy production
and utilization, respectively. Recent studies have demonstrated a major function for AdK in
transferring high energy phosphoryls from metabolic processes generating ATP to cellular
components consuming ATP (Zeleznikar, R.J. et al. (1995) J. Biol. Chem. 270:7311-7319). Thus
AdK may have a pivotal role in maintaining energy production in cells, particularly those having a
high rate of growth or metabolism such as cancer cells, and may provide a target for suppression of its
activity in order to treat certain cancers. Alternatively, reduced AdK activity may be a source of
various metabolic, muscle-energy disorders that can result in cardiac or respiratory failure and may be
treatable by increasing AdK activity.

GuK, in addition to providing a key step in the synthesis of GTP for RNA and DNA
synthesis, also fulfills an essential function in signal transduction pathways of cells through the
regulation of GDP and GTP. Specifically, GTP binding to membrane associated G proteins mediates
the activation of cell receptors, subsequent intracellular activation of adenyl cyclase, and production
of the second messenger, cyclic AMP. GDP binding to G proteins inhibits these processes. GDP and
GTP levels also control the activity of certain oncogenic proteins such as p21ras known to be involved
in control of cell proliferation and oncogenesis (Bos, J.L. (1989) Cancer Res. 49:4682-4689). High
ratios of GTP:GDP caused by suppression of GuK cause activation of p21ras and promote
oncogenesis. Increasing GuK activity to increase levels of GDP and reduce the GTP:GDP ratio may
provide a therapeutic strategy to reverse oncogenesis.

GuK is an important enzyme in the phosphorylation and activation of certain antiviral drugs
useful in the treatment of herpes virus infections. These drugs include the guanine homologs
acyclovir and buciclovir (Miller, W.H. and R.L. Miller (1980) J. Biol. Chem. 253:7204-7207;
Stenberg, K. et al. (1986) J. Biol. Chem. 261:2134-2139). Increasing GuK activity in infected cells
may provide a therapeutic strategy for augmenting the effectiveness of these drugs and possibly for
reducing the necessary dosages of the drugs.

Pyrimidine Kinases

The pyrimidine kinases are deoxycytidine kinase and thymidine kinase 1 and 2.
Deoxycytidine kinase is located in the nucleus, and thymidine kinase 1 and 2 are found in the cytosol
(Johansson, M. et al. (1997) Proc. Natl. Acad. Sci. USA 94:11941-11945). Phosphorylation of
deoxyribonucleosides by pyrimidine kinases provides an alternative pathway for de novo synthesis of
DNA precursors. The role of pyrimidine kinases, like purine kinases, in phosphorylation is critical to
the activation of several chemotherapeutically important nucleoside analogues (Arner, E.S. and S.
Eriksson (1995) Pharmacol. Ther. 67:155-186).

PHOSPHATASES

Protein phosphatases are generally characterized as either serine/threonine- or tyrosine-specific
based on their preferred phospho-amino acid substrate. However, some phosphatases (DSPs,
for dual specificity phosphatases) can act on phosphorylated tyrosine, serine, or threonine residues.
The protein serine/threonine phosphatases (PSPs) are important regulators of many cAMP-mediated
hormone responses in cells. Protein tyrosine phosphatases (PTPs) play a significant role in cell cycle
and cell signaling processes. Another family of phosphatases is the acid phosphatase or histidine acid
phosphatase (HAP) family whose members hydrolyze phosphate esters at acidic pH conditions.

PSPs are found in the cytosol, nucleus, and mitochondria and in association with cytoskeletal
and membranous structures in most tissues, especially the brain. Some PSPs require divalent cations,
such as Ca2+ or Mn2+, for activity. PSPs play important roles in glycogen metabolism, muscle
contraction, protein synthesis, T cell function, neuronal activity, oocyte maturation, and hepatic
metabolism (reviewed in Cohen, P. (1989) Annu. Rev. Biochem. 58:453-508). PSPs can be separated
into two classes. The PPP class includes PP1, PP2A, PP2B/calcineurin, PP4, PP5, PP6, and PP7.
Members of this class are composed of a homologous catalytic subunit bearing a very highly
conserved signature sequence, coupled with one or more regulatory subunits (PROSITE
PDOC00115). Further interactions with scaffold and anchoring molecules determine the intracellular
localization of PSPs and substrate specificity. The PPM class consists of several closely related
isoforms of PP2C and is evolutionarily unrelated to the PPP class.

PP1 dephosphorylates many of the proteins phosphorylated by cyclic AMP-dependent protein
kinase (PKA) and is an important regulator of many cAMP-mediated hormone responses. A
number of isoforms have been identified, with the alpha and beta forms being produced by alternative
splicing of the same gene. Both ubiquitous and tissue-specific targeting proteins for PP1 have been
identified. In the brain, inhibition of PP1 activity by the dopamine and adenosine 3',5'-
monophosphate-regulated phosphoprotein of 32kDa (DARPP-32) is necessary for normal dopamine
response in neostriatal neurons (reviewed in Price, N.E. and M.C. Mumby (1999) Curr. Opin.
Neurobiol. 9:336-342). PP1, along with PP2A, has been shown to limit motility in microvascular
endothelial cells, suggesting a role for PSPs in the inhibition of angiogenesis (Gabel, S. et al. (1999)
Otolaryngol. Head Neck Surg. 121:463-468).

PP2A is the main serine/threonine phosphatase. The core PP2A enzyme consists of a single
36 kDa catalytic subunit (C) associated with a 65 kDa scaffold subunit (A), whose role is to recruit
additional regulatory subunits (B). Three gene families encoding B subunits are known (PR55, PR61,
and PR72), each of which contains multiple isoforms, and additional families may exist (Millward,
T.A. et al. (1999) Trends Biosci. 24:186-191). These "B-type" subunits are cell type- and
tissue-specific and determine the substrate specificity, enzymatic activity, and subcellular
localization of the holoenzyme. The PR55 family is highly conserved and bears a conserved motif
(PROSITE PDOC00785). PR55 increases PP2A activity toward mitogen-activated protein kinase (MAPK)
and MAPK kinase (MEK). PP2A dephosphorylates the MAPK active site, inhibiting the cell's entry into
mitosis. Several proteins can compete with PR55 for PP2A core enzyme binding, including the CKII
kinase catalytic subunit, polyomavirus middle and small T antigens, and SV40 small t antigen.
Viruses may use this mechanism to commandeer PP2A and stimulate progression of the cell through
the cell cycle (Pallas, D.C. et al. (1992) J. Virol. 66:886-893). Altered MAP kinase expression is also
implicated in a variety of disease conditions including cancer, inflammation, immune disorders, and
disorders affecting growth and development. PP2A, in fact, can dephosphorylate and modulate the
activities of more than 30 protein kinases in vitro, and other evidence suggests that the same is true in
vivo for such kinases as PKB, PKC, the calmodulin-dependent kinases, ERK family MAP kinases,
cyclin-dependent kinases, and the IκB kinases (reviewed in Millward et al., supra). PP2A is itself a
substrate for CKI and CKII kinases, and can be stimulated by polycationic macromolecules. A
PP2A-like phosphatase is necessary to maintain the G1 phase destruction of mammalian cyclins A and B
(Bastians, H. et al. (1999) Mol. Biol. Cell 10:3927-3941). PP2A is a major activity in the brain and is
implicated in regulating neurofilament stability and normal neural function, particularly the
phosphorylation of the microtubule-associated protein tau. Hyperphosphorylation of tau has been
proposed to lead to the neuronal degeneration seen in Alzheimer's disease (reviewed in Price and
Mumby, supra).

PP2B, or calcineurin, is a Ca2+-activated dimeric phosphatase and is particularly abundant in
the brain. It consists of catalytic and regulatory subunits, and is activated by the binding of the
calcium/calmodulin complex. Calcineurin is the target of the immunosuppressant drugs cyclosporine
and FK506. Along with other cellular factors, these drugs interact with calcineurin and inhibit
phosphatase activity. In T cells, this blocks the calcium dependent activation of the NF-AT family of
transcription factors, leading to immunosuppression. This family is widely distributed, and it is likely
that calcineurin regulates gene expression in other tissues as well. In neurons, calcineurin modulates
functions which range from the inhibition of neurotransmitter release to desensitization of
postsynaptic NMDA-receptor coupled calcium channels to long term memory (reviewed in Price and
Mumby, supra).

Other members of the PPP class have recently been identified (Cohen, P.T. (1997) Trends
Biochem. Sci. 22:245-251). One of them, PP5, contains regulatory domains with tetratricopeptide
repeats. It can be activated by polyunsaturated fatty acids and anionic phospholipids in vitro and
appears to be involved in a number of signaling pathways, including those controlled by atrial
natriuretic peptide or steroid hormones (reviewed in Andreeva, A.V. and M.A. Kutuzov (1999) Cell
Signal. 11:555-562).

PP2C is a ~42kDa monomer with broad substrate specificity and is dependent on divalent
cations (mainly Mn2+ or Mg2+) for its activity. PP2C proteins share a conserved N-terminal region
with an invariant DGH motif, which contains an aspartate residue involved in cation binding
(PROSITE PDOC00792). Targeting proteins and mechanisms regulating PP2C activity have not
been identified. PP2C has been shown to inhibit the stress-responsive p38 and Jun kinase (JNK)
pathways (Takekawa, M. et al. (1998) EMBO J. 17:4744-4752).

In contrast to PSPs, tyrosine-specific phosphatases (PTPs) are generally monomeric proteins
of very diverse size (from 20kDa to greater than 100kDa) and structure that function primarily in the
transduction of signals across the plasma membrane. PTPs are categorized as either soluble
phosphatases or transmembrane receptor proteins that contain a phosphatase domain. All PTPs share
a conserved catalytic domain of about 300 amino acids which contains the active site. The active site
consensus sequence includes a cysteine residue which executes a nucleophilic attack on the phosphate
moiety during catalysis (Neel, B.G. and N.K. Tonks (1997) Curr. Opin. Cell Biol. 9:193-204).
Receptor PTPs are made up of an N-terminal extracellular domain of variable length, a
transmembrane region, and a cytoplasmic region that generally contains two copies of the catalytic
domain. Although only the first copy seems to have enzymatic activity, the second copy
affects the substrate specificity of the first. The extracellular domains of some receptor PTPs contain
fibronectin-like repeats, immunoglobulin-like domains, MAM domains (an extracellular motif likely
to have an adhesive function), or carbonic anhydrase-like domains (PROSITE PDOC00323). This
wide variety of structural motifs accounts for the diversity in size and specificity of PTPs.

PTPs play important roles in biological processes such as cell adhesion, lymphocyte
activation, and cell proliferation. PTPs μ and κ are involved in cell-cell contacts, perhaps regulating
cadherin/catenin function. A number of PTPs affect cell spreading, focal adhesions, and cell motility,
most of them via the integrin/tyrosine kinase signaling pathway (reviewed in Neel and Tonks, supra).
CD45 phosphatases regulate signal transduction and lymphocyte activation (Ledbetter, J.A. et al.
(1988) Proc. Natl. Acad. Sci. USA 85:8628-8632). Soluble PTPs containing Src-homology-2
domains have been identified (SHPs), suggesting that these molecules might interact with receptor
tyrosine kinases. SHP-1 regulates cytokine receptor signaling by controlling the Janus family PTKs
in hematopoietic cells, as well as signaling by the T-cell receptor and c-Kit (reviewed in Neel and
Tonks, supra). M-phase inducer phosphatase plays a key role in the induction of mitosis by
dephosphorylating and activating the PTK CDC2, leading to cell division (Sadhu, K. et al. (1990)
Proc. Natl. Acad. Sci. USA 87:5139-5143). In addition, the genes encoding at least eight PTPs have
been mapped to chromosomal regions that are translocated or rearranged in various neoplastic
conditions, including lymphoma, small cell lung carcinoma, leukemia, adenocarcinoma, and
neuroblastoma (reviewed in Charbonneau, H. and N.K. Tonks (1992) Annu. Rev. Cell Biol.
8:463-493). The PTP enzyme active site comprises the consensus sequence of the MTM1 gene family. The
MTM1 gene is responsible for X-linked recessive myotubular myopathy, a congenital muscle
disorder that has been linked to Xq28 (Kioschis, P. et al. (1998) Genomics 54:256-266). Many PTKs
are encoded by oncogenes, and it is well known that oncogenesis is often accompanied by increased
tyrosine phosphorylation activity. It is therefore possible that PTPs may serve to prevent or reverse
cell transformation and the growth of various cancers by controlling the levels of tyrosine
phosphorylation in cells. This is supported by studies showing that overexpression of PTP can
suppress transformation in cells and that specific inhibition of PTP can enhance cell transformation
(Charbonneau and Tonks, supra).

Dual specificity phosphatases (DSPs) are structurally more similar to the PTPs than the PSPs.
DSPs bear an extended PTP active site motif with an additional 7 amino acid residues. DSPs are
primarily associated with cell proliferation and include the cell cycle regulators cdc25A, B, and C.
The phosphatases DUSP1 and DUSP2 inactivate the MAPK family members ERK (extracellular
signal-regulated kinase), JNK (c-Jun N-terminal kinase), and p38 on both tyrosine and threonine
residues (PROSITE PDOC00323, supra). In the activated state, these kinases have been implicated
in neuronal differentiation, proliferation, oncogenic transformation, platelet aggregation, and
apoptosis. Thus, DSPs are necessary for proper regulation of these processes (Muda, M. et al. (1996)
J. Biol. Chem. 271:27205-27208). The tumor suppressor PTEN is a DSP that also shows lipid
phosphatase activity. It seems to negatively regulate interactions with the extracellular matrix and
maintains sensitivity to apoptosis. PTEN has been implicated in the prevention of angiogenesis (Giri,
D. and M. Ittmann (1999) Hum. Pathol. 30:419-424) and abnormalities in its expression are
associated with numerous cancers (reviewed in Tamura, M. et al. (1999) J. Natl. Cancer Inst.
91:1820-1828).

Histidine acid phosphatase (HAP; EXPASY EC 3.1.3.2), also known as acid phosphatase,
hydrolyzes a wide spectrum of substrates including alkyl, aryl, and acyl orthophosphate monoesters
and phosphorylated proteins at low pH. HAPs share two regions of conserved sequences, each
centered around a histidine residue which is involved in catalytic activity. Members of the HAP
family include lysosomal acid phosphatase (LAP) and prostatic acid phosphatase (PAP), both
sensitive to inhibition by L-tartrate (PROSITE PDOC00538).

Synaptojanin, a polyphosphoinositide phosphatase, dephosphorylates phosphoinositides at
positions 3, 4 and 5 of the inositol ring. Synaptojanin is a major presynaptic protein found at
clathrin-coated endocytic intermediates in nerve terminals, and binds the clathrin coat-associated
protein, EPS15. This binding is mediated by the C-terminal region of synaptojanin-170, which has 3
Asp-Pro-Phe amino acid repeats. Further, this 3 residue repeat has been found to be the binding site
for the EH domains of EPS15 (Haffner, C. et al. (1997) FEBS Lett. 419:175-180). Additionally,
synaptojanin may potentially regulate interactions of endocytic proteins with the plasma membrane,
and be involved in synaptic vesicle recycling (Brodin, L. et al. (2000) Curr. Opin. Neurobiol.
10:312-320). Mice with a targeted disruption in the synaptojanin 1 gene (Synj1) were shown to
support coat formation of endocytic vesicles more effectively than was seen in wild-type mice,
suggesting that Synj1 can act as a negative regulator of membrane-coat protein interactions. These
findings provide genetic evidence for a crucial role of phosphoinositide metabolism in synaptic
vesicle recycling (Cremona, O. et al. (1999) Cell 99:179-188).

Expression profiling

Microarrays are analytical tools used in bioanalysis. A microarray has a plurality of
molecules spatially distributed over, and stably associated with, the surface of a solid support.
Microarrays of polypeptides, polynucleotides, and/or antibodies have been developed and find use in
a variety of applications, such as gene sequencing, monitoring gene expression, gene mapping,
bacterial identification, drug discovery, and combinatorial chemistry.

One area in particular in which microarrays find use is in gene expression analysis. Array
technology can provide a simple way to explore the expression of a single polymorphic gene or the
expression profile of a large number of related or unrelated genes. When the expression of a single
gene is examined, arrays are employed to detect the expression of a specific gene or its variants. When
an expression profile is examined, arrays provide a platform for identifying genes that are
tissue specific, are affected by a substance being tested in a toxicology assay, are part of a signaling
cascade, carry out housekeeping functions, or are specifically related to a particular genetic
predisposition, condition, disease, or disorder.

Neurological disorders

Characterization of region-specific gene expression in the human brain provides a context
and background for molecular neurobiology on a variety of neurological disorders. For example,
Alzheimer's disease (AD) is a progressive, neurodestructive process of the human neocortex,
characterized by the deterioration of memory and higher cognitive function. A progressive and
irreversible brain disorder, AD is characterized by three major pathogenic episodes involving (a) an
aberrant processing and deposition of beta-amyloid precursor protein (betaAPP) to form neurotoxic
beta-amyloid (betaA) peptides and an aggregated insoluble polymer of betaA that forms the senile
plaque, (b) the establishment of intraneuronal neuritic tau pathology yielding widespread deposits of
argyrophilic neurofibrillary tangles (NFT) and (c) the initiation and proliferation of a brain-specific
inflammatory response. These three seemingly disperse attributes of AD etiopathogenesis are linked
by the fact that proinflammatory microglia, reactive astrocytes and their associated cytokines and
chemokines are associated with the biology of the microtubule associated protein tau, betaA
speciation and aggregation. Missense mutations in the presenilin genes PS1 and PS2, implicated in
early onset familial AD, cause abnormal betaAPP processing with resultant overproduction of
betaA42 and related neurotoxic peptides. Specific betaA fragments such as betaA42 can further
potentiate proinflammatory mechanisms. Expression of the inducible oxidoreductase
cyclooxygenase-2 and cytosolic phospholipase A2 (cPLA2) is strongly activated during cerebral
ischemia and trauma, epilepsy and AD, indicating the induction of proinflammatory gene pathways as
a response to brain injury. Neurotoxic metals such as aluminum and zinc, both implicated in AD
etiopathogenesis, and arachidonic acid, a major metabolite of brain cPLA2 activity, each polymerize
hyperphosphorylated tau to form NFT-like bundles. Studies have identified a reduced risk for AD in
patients aged over 70 years who were previously treated with non-steroidal anti-inflammatory drugs
for non-CNS afflictions that include arthritis. (For a review of the interrelationship between the
mechanisms of PS1, PS2 and betaAPP gene expression, tau and betaA deposition and the induction,
regulation and proliferation in AD of the neuroinflammatory response, see Lukiw, W.J. and Bazan,
N.G. (2000) Neurochem. Res. 25:1173-1184).

Breast cancer

More than 180,000 new cases of breast cancer are diagnosed each year, and the mortality rate
for breast cancer approaches 10% of all deaths in females between the ages of 45-54 (Gish, K. (1999)
AWIS Magazine 28:7-10). However, the survival rate based on early diagnosis of localized breast
cancer is extremely high (97%), compared with the advanced stage of the disease in which the tumor
has spread beyond the breast (22%). Current procedures for clinical breast examination are lacking in
sensitivity and specificity, and efforts are underway to develop comprehensive gene expression
profiles for breast cancer that may be used in conjunction with conventional screening methods to
improve diagnosis and prognosis of this disease (Perou, C.M. et al. (2000) Nature 406:747-752).

Mutations in two genes, BRCA1 and BRCA2, are known to greatly predispose a woman to
breast cancer and may be passed on from parents to children (Gish, supra). However, this type of
hereditary breast cancer accounts for only about 5% to 9% of breast cancers, while the vast majority
of breast cancer is due to non-inherited mutations that occur in breast epithelial cells.

The relationship between expression of epidermal growth factor (EGF) and its receptor,
EGFR, to human mammary carcinoma has been particularly well studied. (See Khazaie, K. et al.
(1993) Cancer and Metastasis Rev. 12:255-274, and references cited therein for a review of this area.)
Overexpression of EGFR, particularly coupled with down-regulation of the estrogen receptor, is a
marker of poor prognosis in breast cancer patients. In addition, EGFR expression in breast tumor
metastases is frequently elevated relative to the primary tumor, suggesting that EGFR is involved in
tumor progression and metastasis. This is supported by accumulating evidence that EGF has effects
on cell functions related to metastatic potential, such as cell motility, chemotaxis, secretion and
differentiation. Changes in expression of other members of the erbB receptor family, of which EGFR
is one, have also been implicated in breast cancer. The abundance of erbB receptors, such as
HER-2/neu, HER-3, and HER-4, and their ligands in breast cancer points to their functional importance in
the pathogenesis of the disease, and may therefore provide targets for therapy of the disease (Bacus,
S.S. et al. (1994) Am. J. Clin. Pathol. 102:S13-S24). Other known markers of breast cancer include a
human secreted frizzled protein mRNA that is downregulated in breast tumors; the matrix Gla
protein which is overexpressed in human breast carcinoma cells; Drg1 or RTP, a gene whose
expression is diminished in colon, breast, and prostate tumors; maspin, a tumor suppressor gene
downregulated in invasive breast carcinomas; and CaN19, a member of the S100 protein family, all of
which are downregulated in mammary carcinoma cells relative to normal mammary epithelial cells
(Zhou, Z. et al. (1998) Int. J. Cancer 78:95-99; Chen, L. et al. (1990) Oncogene 5:1391-1395; Ulrix,
W. et al. (1999) FEBS Lett. 455:23-26; Sager, R. et al. (1996) Curr. Top. Microbiol. Immunol.
213:51-64; and Lee, S.W. et al. (1992) Proc. Natl. Acad. Sci. USA 89:2504-2508).

Cell lines derived from human mammary epithelial cells at various stages of breast cancer
provide a useful model to study the process of malignant transformation and tumor progression as it
has been shown that these cell lines retain many of the properties of their parental tumors for lengthy
culture periods (Wistuba, I.I. et al. (1998) Clin. Cancer Res. 4:2931-2938). Such a model is
particularly useful for comparing phenotypic and molecular characteristics of human mammary
epithelial cells at various stages of malignant transformation.

There is a need in the art for new compositions, including nucleic acids and proteins, for the
diagnosis, prevention, and treatment of cardiovascular diseases, immune system disorders, neurological
disorders, disorders affecting growth and development, lipid disorders, cell proliferative disorders, and
cancers.

SUMMARY OF THE INVENTION 
Various embodiments of the invention provide purified polypeptides, kinases and phosphatases,
referred to collectively as 'KPP' and individually as 'KPP-1,' 'KPP-2,' 'KPP-3,' 'KPP-4,' 'KPP-5,'
'KPP-6,' 'KPP-7,' 'KPP-8,' 'KPP-9,' 'KPP-10,' 'KPP-11,' 'KPP-12,' 'KPP-13,' 'KPP-14,' and
'KPP-15' and methods for using these proteins and their encoding polynucleotides for the detection,
diagnosis, and treatment of diseases and medical conditions. Embodiments also provide methods for
utilizing the purified kinases and phosphatases and/or their encoding polynucleotides for facilitating the
drug discovery process, including determination of efficacy, dosage, toxicity, and pharmacology.
Related embodiments provide methods for utilizing the purified kinases and phosphatases and/or their
encoding polynucleotides for investigating the pathogenesis of diseases and medical conditions.

An embodiment provides an isolated polypeptide selected from the group consisting of a) a
polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-15, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical
or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-15, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-15, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-15. Another
embodiment provides an isolated polypeptide comprising an amino acid sequence of SEQ ID NO:1-15.

Still another embodiment provides an isolated polynucleotide encoding a polypeptide selected
from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-15, b) a polypeptide comprising a naturally occurring amino acid
sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-15, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-15, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-15. In another embodiment, the polynucleotide encodes a polypeptide
selected from the group consisting of SEQ ID NO:1-15. In an alternative embodiment, the
polynucleotide is selected from the group consisting of SEQ ID NO:16-30.

Still another embodiment provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-15, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-15, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-15, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-15.
Another embodiment provides a cell transformed with the recombinant polynucleotide. Yet another
embodiment provides a transgenic organism comprising the recombinant polynucleotide.

Another embodiment provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-15, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-15, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-15, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-15.
The method comprises a) culturing a cell under conditions suitable for expression of the polypeptide,
wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence
operably linked to a polynucleotide encoding the polypeptide, and b) recovering the polypeptide so
expressed.

Yet another embodiment provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15, b) a polypeptide comprising a naturally
occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid
sequence selected from the group consisting of SEQ ID NO:1-15, c) a biologically active fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-15,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-15.

Still yet another embodiment provides an isolated polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:16-30, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected
from the group consisting of SEQ ID NO:16-30, c) a polynucleotide complementary to the
polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA
equivalent of a)-d). In other embodiments, the polynucleotide can comprise at least about 20, 30, 40,
60, 80, or 100 contiguous nucleotides.

Yet another embodiment provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide being selected from the group consisting of a) a polynucleotide comprising a
polynucleotide sequence selected from the group consisting of SEQ ID NO:16-30, b) a polynucleotide
comprising a naturally occurring polynucleotide sequence at least 90% identical or at least about 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:16-30, c) a
polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the
polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the
sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence
complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to
said target polynucleotide, under conditions whereby a hybridization complex is formed between said
probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of
said hybridization complex. In a related embodiment, the method can include detecting the amount of
the hybridization complex. In still other embodiments, the probe can comprise at least about 20, 30,
40, 60, 80, or 100 contiguous nucleotides.

Still yet another embodiment provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:16-30, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:16-30, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide
or fragment thereof. In a related embodiment, the method can include detecting the amount of the
amplified target polynucleotide or fragment thereof.

Another embodiment provides a composition comprising an effective amount of a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-15, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-15,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-15, and a pharmaceutically acceptable excipient. In one
embodiment, the composition can comprise an amino acid sequence selected from the group consisting
of SEQ ID NO:1-15. Other embodiments provide a method of treating a disease or condition
associated with decreased or abnormal expression of functional KPP, comprising administering to a
patient in need of such treatment the composition.

Yet another embodiment provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-15, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-15, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-15, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15. The method comprises a) contacting a
sample comprising the polypeptide with a compound, and b) detecting agonist activity in the sample.
Another embodiment provides a composition comprising an agonist compound identified by the method
and a pharmaceutically acceptable excipient. Yet another embodiment provides a method of treating
a disease or condition associated with decreased expression of functional KPP, comprising
administering to a patient in need of such treatment the composition.

Still yet another embodiment provides a method for screening a compound for effectiveness
as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-15, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-15, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-15, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-15. The method comprises a)
contacting a sample comprising the polypeptide with a compound, and b) detecting antagonist activity
in the sample. Another embodiment provides a composition comprising an antagonist compound
identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment
provides a method of treating a disease or condition associated with overexpression of functional KPP,
comprising administering to a patient in need of such treatment the composition.

Another embodiment provides a method of screening for a compound that specifically binds to
a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-15, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-15, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-15, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15. The method comprises a) combining the
polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the
polypeptide to the test compound, thereby identifying a compound that specifically binds to the
polypeptide.

Yet another embodiment provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-15, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-15, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-15, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15. The method comprises a) combining the
polypeptide with at least one test compound under conditions permissive for the activity of the
polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c)
comparing the activity of the polypeptide in the presence of the test compound with the activity of the
polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in
the presence of the test compound is indicative of a compound that modulates the activity of the
polypeptide.

Still yet another embodiment provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO:16-30, the method
comprising a) contacting a sample comprising the target polynucleotide with a compound, b) detecting
altered expression of the target polynucleotide, and c) comparing the expression of the target
polynucleotide in the presence of varying amounts of the compound and in the absence of the
compound.

Another embodiment provides a method for assessing toxicity of a test compound, the
method comprising a) treating a biological sample containing nucleic acids with the test compound; b)
hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:16-30, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:16-30, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide
complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs
under conditions whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide selected from the group consisting of
i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ
ID NO:16-30, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical or at least about 90% identical to a polynucleotide sequence selected from the group
consisting of SEQ ID NO:16-30, iii) a polynucleotide complementary to the polynucleotide of i), iv) a
polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv).
Alternatively, the target polynucleotide can comprise a fragment of a polynucleotide selected from the
group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing
the amount of hybridization complex in the treated biological sample with the amount of hybridization
complex in an untreated biological sample, wherein a difference in the amount of hybridization
complex in the treated biological sample is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES

Table 1 summarizes the nomenclature for full length polynucleotide and polypeptide
embodiments of the invention.

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog, and the PROTEOME database identification numbers and annotations of PROTEOME
database homologs, for polypeptide embodiments of the invention. The probability scores for the
matches between each polypeptide and its homolog(s) are also shown.

Table 3 shows structural features of polypeptide embodiments, including predicted motifs and
domains, along with the methods, algorithms, and searchable databases used for analysis of the
polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide embodiments, along with selected fragments of the polynucleotides.

Table 5 shows representative cDNA libraries for polynucleotide embodiments.

Table 6 provides an appendix which describes the tissues and vectors used for construction of
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze polynucleotides and
polypeptides, along with applicable descriptions, references, and threshold parameters.

Table 8 shows single nucleotide polymorphisms found in polynucleotide sequences of the
invention, along with allele frequencies in different human populations.

Before the present proteins, nucleic acids, and methods are described, it is understood that
embodiments of the invention are not limited to the particular machines, instruments, materials, and
methods described, as these may vary. It is also to be understood that the terminology used herein is
for the purpose of describing particular embodiments only, and is not intended to limit the scope of the
invention.

As used herein and in the appended claims, the singular forms "a," "an," and "the" include
plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "a
host cell" includes a plurality of such host cells, and a reference to "an antibody" is a reference to one
or more antibodies and equivalents thereof known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs.
Although any machines, materials, and methods similar or equivalent to those described herein can be
used to practice or test the present invention, the preferred machines, materials and methods are now
described. All publications mentioned herein are cited for the purpose of describing and disclosing the
cell lines, protocols, reagents and vectors which are reported in the publications and which might be
used in connection with various embodiments of the invention. Nothing herein is to be construed as an
admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

DEFINITIONS

"KPP" refers to the amino acid sequences of substantially purified KPP obtained from any
species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and human,
and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of
KPP. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other
compound or composition which modulates the activity of KPP either by directly interacting with KPP
or by acting on components of the biological pathway in which KPP participates.

An "allelic variant" is an alternative form of the gene encoding KPP. Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the others, one or more times
in a given sequence.

"Altered" nucleic acid sequences encoding KPP include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as KPP or a
polypeptide with at least one functional characteristic of KPP. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of
the polynucleotide encoding KPP, and improper or unexpected hybridization to allelic variants, with a
locus other than the normal chromosomal locus for the polynucleotide encoding KPP. The encoded
protein may also be "altered," and may contain deletions, insertions, or substitutions of amino acid
residues which produce a silent change and result in a functionally equivalent KPP. Deliberate amino
acid substitutions may be made on the basis of one or more similarities in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological
or immunological activity of KPP is retained. For example, negatively charged amino acids may
include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and
arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may
include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains
having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine;
and phenylalanine and tyrosine.

The terms "amino acid" and "amino acid sequence" can refer to an oligopeptide, a peptide, a
polypeptide, or a protein sequence, or a fragment of any of these, and to naturally occurring or
synthetic molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally
occurring protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid
sequence to the complete native amino acid sequence associated with the recited protein molecule.

"Amplification" relates to the production of additional copies of a nucleic acid. Amplification
may be carried out using polymerase chain reaction (PCR) technologies or other nucleic acid
amplification technologies well known in the art.

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity
of KPP. Antagonists may include proteins such as antibodies, anticalins, nucleic acids, carbohydrates,
small molecules, or any other compound or composition which modulates the activity of KPP either by
directly interacting with KPP or by acting on components of the biological pathway in which KPP
participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind KPP polypeptides can be prepared using intact polypeptides or using fragments
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used
to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA,
or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used
carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and
keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies
which bind specifically to antigenic determinants (particular regions or three-dimensional structures on
the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used
to elicit the immune response) for binding to an antibody.

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX
(Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Patent No.
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. The
nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of a
ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system.
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a
cross-linker (Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13).

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a
vaccinia virus-based RNA expression system has been used to express specific RNA aptamers at
high levels in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA
96:3606-3610).

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other left-handed
nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on
substrates containing right-handed nucleotides.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a polynucleotide having a specific nucleic acid sequence. Antisense compositions
may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone
linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides
having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or
oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine.
Antisense molecules may be produced by any method including chemical synthesis
or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with a
naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either
transcription or translation. The designation "negative" or "minus" can refer to the antisense strand,
and the designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or biochemical
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic"
refers to the capability of the natural, recombinant, or synthetic KPP, or of any oligopeptide thereof, to
induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
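The orientation convention in this example is easy to misread, so a short sketch may help. It assumes DNA written over A, C, G, T; the function names are illustrative only. The first function returns the complement written 3' to 5' (aligned base by base under the input), and the second returns the same strand rewritten in the conventional 5' to 3' direction.

```python
# Watson-Crick pairing for DNA; names below are illustrative, not from the specification.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Complementary strand written 3'->5', aligned base by base with the input."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())


def reverse_complement(strand_5_to_3: str) -> str:
    """The same complementary strand, rewritten in the 5'->3' direction."""
    return complement_3_to_5(strand_5_to_3)[::-1]


print("5'-AGT-3' pairs with 3'-" + complement_3_to_5("AGT") + "-5'")
```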

A "composition comprising a given polynucleotide" and a "composition comprising a given
polypeptide" can refer to any composition containing the given polynucleotide or polypeptide. The
composition may comprise a dry formulation or an aqueous solution. Compositions comprising
polynucleotides encoding KPP or fragments of KPP may be employed as hybridization probes. The
probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as a
carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts
(e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's
solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to
repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (Accelrys,
Burlington MA) or Phrap (University of Washington, Seattle WA). Some sequences have been both
extended and assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of the
protein is conserved and not significantly changed by such substitutions. The table below shows amino
acids which may be substituted for an original amino acid in a protein and which are regarded as
conservative amino acid substitutions.

    Original Residue    Conservative Substitution
    Ala                 Gly, Ser
    Arg                 His, Lys
    Asn                 Asp, Gln, His
    Asp                 Asn, Glu
    Cys                 Ala, Ser
    Gln                 Asn, Glu, His
    Glu                 Asp, Gln, His
    Gly                 Ala
    His                 Asn, Arg, Gln, Glu
    Ile                 Leu, Val
    Leu                 Ile, Val
    Lys                 Arg, Gln, Glu
    Met                 Leu, Ile
    Phe                 His, Met, Leu, Trp, Tyr
    Ser                 Cys, Thr
    Thr                 Ser, Val
    Trp                 Phe, Tyr
    Tyr                 His, Phe, Trp
    Val                 Ile, Leu, Thr

Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation,
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of
the side chain.
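The substitution table can be expressed as a simple lookup, sketched below with three-letter residue codes. The dictionary and function names are illustrative. The row label "Val" is an assumption: it is unreadable in the scanned source and is inferred from the listed substitutions (Ile, Leu, Thr). Note that the table is directional, so a substitution being listed for one residue does not imply the reverse.

```python
# Conservative substitutions as listed in the table above (three-letter codes).
# Assumption: the "Val" row label is inferred; it is illegible in the scanned source.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` is listed as conservative."""
    return replacement in CONSERVATIVE.get(original, set())
```

For example, `is_conservative("Cys", "Ala")` is true while `is_conservative("Ala", "Cys")` is false, and any substitution of Pro (which has no row in the table) is reported as non-conservative.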

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the
absence of one or more amino acid residues or nucleotides.

The term "derivative" refers to a chemically modified polynucleotide or polypeptide.
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains
at least one biological or immunological function of the natural molecule. A derivative polypeptide is
one modified by glycosylation, pegylation, or any similar process that retains at least one biological or
immunological function of the polypeptide from which it was derived.

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or
absent gene or protein expression, determined by comparing at least two different samples. Such
comparisons may be carried out between, for example, a treated and an untreated sample, or a
diseased and a normal sample.

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an
exon may represent a structural or functional domain of the encoded protein, new proteins may be
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the
evolution of new protein functions.

A "fragment" is a unique portion of KPP or a polynucleotide encoding KPP which can be
identical in sequence to, but shorter in length than, the parent sequence. A fragment may comprise up
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise from about 5 to about 1000 contiguous nucleotides or amino acid residues. A
fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least
5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino
acid residues in length. Fragments may be preferentially selected from certain regions of a molecule.
For example, a polypeptide fragment may comprise a certain length of contiguous amino acids
selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a
certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present
embodiments.

A fragment of SEQ ID NO:16-30 can comprise a region of unique polynucleotide sequence
that specifically identifies SEQ ID NO:16-30, for example, as distinct from any other sequence in the
genome from which the fragment was obtained. A fragment of SEQ ID NO:16-30 can be employed
in one or more embodiments of methods of the invention, for example, in hybridization and
amplification technologies and in analogous methods that distinguish SEQ ID NO:16-30 from related
polynucleotides. The precise length of a fragment of SEQ ID NO:16-30 and the region of SEQ ID
NO:16-30 to which the fragment corresponds are routinely determinable by one of ordinary skill in the
art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-15 is encoded by a fragment of SEQ ID NO:16-30. A
fragment of SEQ ID NO:1-15 can comprise a region of unique amino acid sequence that specifically
identifies SEQ ID NO:1-15. For example, a fragment of SEQ ID NO:1-15 can be used as an
immunogenic peptide for the development of antibodies that specifically recognize SEQ ID NO:1-15.
The precise length of a fragment of SEQ ID NO:1-15 and the region of SEQ ID NO:1-15 to which
the fragment corresponds can be determined based on the intended purpose for the fragment using
one or more analytical methods described herein or otherwise known in the art.

A "full length" polynucleotide is one containing at least a translation initiation codon (e.g.,
methionine) followed by an open reading frame and a translation termination codon. A "full length"
polynucleotide sequence encodes a "full length" polypeptide sequence.
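As a rough illustration of this definition, the sketch below scans the forward strand of a DNA sequence for an ATG initiation codon followed in frame by a termination codon. It assumes the standard genetic code, considers the forward strand only, and uses a hypothetical function name; it is not the analysis pipeline used for the sequences of the invention.

```python
# Standard termination codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_full_length_orf(seq: str):
    """Return (start, end) of the first ATG...stop open reading frame, or None.

    `start` is the index of the A in ATG; `end` is one past the stop codon,
    so seq[start:end] is the open reading frame including both codons.
    """
    seq = seq.upper()
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by the initiation codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
    return None
```

A sequence for which the function returns `None` lacks either an initiation codon or an in-frame termination codon and so would not qualify as "full length" under this simplified check.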

"Homology" refers to sequence similarity or, alternatively, sequence identity, between two or 
more polynucleotide sequences or two or more polypeptide sequences. 

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer to
the percentage of identical nucleotide matches between at least two polynucleotide sequences aligned
using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way,
gaps in the sequences being compared in order to optimize alignment between two sequences, and
therefore achieve a more meaningful comparison of the two sequences.
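Once two sequences have been aligned, percent identity reduces to a count of matching columns. The sketch below takes two already-aligned strings with "-" marking inserted gaps. One assumption should be stated plainly: the denominator here is the total number of alignment columns, including gapped ones; alignment programs differ on this convention, so values from a particular program may not match.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: the denominator is the full alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Ten columns: eight matches, one mismatch, one gap.
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))
```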

Percent identity between polynucleotide sequences may be determined using one or more
computer algorithms or programs known in the art or described herein. For example, percent identity
can be determined using the default parameters of the CLUSTAL V algorithm as incorporated into
the MEGALIGN version 3.12e sequence alignment program. This program is part of the
LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR,
Madison WI). CLUSTAL V is described in Higgins, D.G. and P.M. Sharp (1989; CABIOS 5:151-153)
and in Higgins, D.G. et al. (1992; CABIOS 8:189-191). For pairwise alignments of
polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5,
window=4, and "diagonals saved"=34. The "weighted" residue weight table is selected as the default.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms
which can be used is provided by the National Center for Biotechnology Information (NCBI) Basic
Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which
is available from several sources, including the NCBI, Bethesda, MD, and on the Internet at
ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis programs
including "blastn," that is used to align a known polynucleotide sequence with other polynucleotide
sequences from a variety of databases. Also available is a tool called "BLAST 2 Sequences" that is
used for direct pairwise comparison of two nucleotide sequences. "BLAST 2 Sequences" can be
accessed and used interactively at ncbi.nlm.nih.gov/gorf/bl2.html. The "BLAST 2 Sequences" tool
can be used for both blastn and blastp (discussed below). BLAST programs are commonly used with
gap and other parameters set to default settings. For example, to compare two nucleotide sequences,
one may use blastn with the "BLAST 2 Sequences" tool Version 2.0.12 (April-21-2000) set at default
parameters. Such default parameters may be, for example:

    Matrix: BLOSUM62
    Reward for match: 1
    Penalty for mismatch: -2
    Open Gap: 5 and Extension Gap: 2 penalties
    Gap x drop-off: 50
    Expect: 10
    Word Size: 11
    Filter: on
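To show how the match, mismatch, and gap parameters listed above interact, the following sketch computes a raw score for an alignment that has already been produced. It is not BLAST itself and performs no search. It assumes the affine convention in which a gap of length k costs the open penalty plus k times the extension penalty; the function name and default arguments are illustrative.

```python
def raw_alignment_score(
    aligned_a: str,
    aligned_b: str,
    reward: int = 1,
    penalty: int = -2,
    gap_open: int = 5,
    gap_extend: int = 2,
    gap: str = "-",
) -> int:
    """Raw score of a given nucleotide alignment under affine gap costs.

    Assumption: a gap of length k costs gap_open + k * gap_extend.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    prev_gap_row = None  # which row ("a" or "b") held a gap in the previous column
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            raise ValueError("column with gaps in both rows")
        if x == gap or y == gap:
            row = "a" if x == gap else "b"
            if row != prev_gap_row:
                score -= gap_open  # a new gap starts in this row
            score -= gap_extend
            prev_gap_row = row
        else:
            score += reward if x == y else penalty
            prev_gap_row = None
    return score
```

With the listed values, eight matches, one mismatch, and one single-base gap give 8 - 2 - (5 + 2) = -1, which illustrates why short alignments containing gaps score poorly under these settings.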

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported
by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a
length over which percentage identity may be measured.

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein.
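The effect of degeneracy can be seen in a small worked example. The sketch below builds the standard genetic code and translates two made-up 12-nucleotide sequences (they are illustrative and do not come from the Sequence Listing). The two sequences agree at only 5 of 12 positions yet encode the same peptide.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 0 ('*' marks a stop codon)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identical_positions(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


# Illustrative sequences only.
seq1 = "ATGCTTAGCCGT"
seq2 = "ATGTTGTCAAGA"
print(translate(seq1), translate(seq2), identical_positions(seq1, seq2))
```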

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to 
the percentage of identical residue matches between at least two polypeptide sequences aligned using 
a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some 
alignment methods take into account conservative amino acid substitutions. Such conservative 
substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the 
site of substitution, thus preserving the structure (and therefore function) of the polypeptide. The 
phrases "percent similarity" and "% similarity," as applied to polypeptide sequences, refer to the 
percentage of residue matches, including identical residue matches and conservative substitutions, 
between at least two polypeptide sequences aligned using a standardized algorithm. In contrast, 
conservative substitutions are not included in the calculation of percent identity between polypeptide 
sequences. 
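
The distinction between percent identity and percent similarity can be sketched as follows. The residue groups used here are purely illustrative assumptions chosen for the example; they are not the conservative substitution table referred to above, and the function and constant names are hypothetical. The sequences are assumed to be pre-aligned, equal-length strings with "-" for gaps.

```python
# Illustrative groupings only (assumption): residues in the same string are
# treated as conservative substitutions for one another.
CONSERVATIVE_GROUPS = ("AVLIM", "FWY", "KRH", "DE", "STNQ")


def is_conservative(x: str, y: str) -> bool:
    """True if two different residues fall in the same illustrative group."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) over the alignment length."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = 0
    similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are neither identical nor similar
        if x == y:
            identical += 1
            similar += 1  # identical matches also count toward similarity
        elif is_conservative(x, y):
            similar += 1  # counted for similarity only, never for identity
    n = len(aligned_a)
    return 100.0 * identical / n, 100.0 * similar / n


# M/M and L/L are identical; K/R, V/I and D/E are conservative in this grouping.
print(identity_and_similarity("MKVLD", "MRILE"))
```

Percent similarity is never lower than percent identity for the same alignment, because every identical match is also counted as a similar match.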

Percent identity between polypeptide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e 
sequence alignment program (described and referenced above). For pairwise alignments of 
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap 
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default 
residue weight table. 

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise 
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for 
example: 

Matrix: BLOSUM62 
Open Gap: 11 and Extension Gap: 1 penalties 
Gap x drop-off: 50 
Expect: 10 
Word Size: 3 
Filter: on 

Percent identity may be measured over the length of an entire defined polypeptide sequence, 
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, 
for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for 
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment 
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be 
used to describe a length over which percentage identity may be measured. 

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain 
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for 
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid 
sequence in the non-antigen binding regions has been altered so that the antibody more closely 
resembles a human antibody, and still retains its original binding ability. 

"Hybridization" refers to the process by which a polynucleotide strand anneals with a 
complementary strand through base pairing under defined hybridization conditions. Specific 
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. 
Specific hybridization complexes form under permissive annealing conditions and remain hybridized 
after the "washing" step(s). The washing step(s) is particularly important in determining the 
stringency of the hybridization process, with more stringent conditions allowing less non-specific 
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive 
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in 
the art and may be consistent among hybridization experiments, whereas wash conditions may be 
varied among experiments to achieve the desired stringency, and therefore hybridization specificity. 
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about 
1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA. 

Generally, stringency of hybridization is expressed, in part, with reference to the temperature 
under which the wash step is carried out. Such wash temperatures are typically selected to be about 
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic 
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of 
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and 
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. and D.W. 
Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3, Cold Spring Harbor Press, 
Cold Spring Harbor NY, ch. 9). 
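
As an illustration of how a wash temperature might be chosen relative to Tm, the sketch below uses one commonly quoted empirical approximation for DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 675/n. This particular formula, the sodium concentration assumed for 1 x SSC (0.195 M, from 0.15 M NaCl plus 0.015 M trisodium citrate), the default wash window of 5°C to 20°C below Tm, and all function names are assumptions made for the example; the cited reference should be consulted for the equation actually intended.

```python
import math

# Assumption: 1 x SSC contributes about 0.195 M sodium ion.
SODIUM_MOLAR_PER_1X_SSC = 0.195


def sodium_molarity(ssc_fold: float) -> float:
    """Approximate molar Na+ concentration of an SSC dilution (e.g. 0.2 for 0.2 x SSC)."""
    return ssc_fold * SODIUM_MOLAR_PER_1X_SSC


def melting_temperature_c(sequence: str, sodium_molar: float) -> float:
    """Estimate Tm in degrees C with an empirical salt- and GC-dependent formula (assumed)."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0 or sodium_molar <= 0:
        raise ValueError("sequence must be non-empty and sodium concentration positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 675.0 / n


def wash_temperature_range_c(tm_c: float, low_offset: float = 20.0,
                             high_offset: float = 5.0) -> tuple[float, float]:
    """Return the (lowest, highest) wash temperature for a window below Tm."""
    return tm_c - low_offset, tm_c - high_offset


probe = "ATGC" * 25                       # hypothetical 100-nt probe, 50% G+C
tm = melting_temperature_c(probe, sodium_molarity(0.2))
print(round(tm, 1), wash_temperature_range_c(tm))
```

Lowering the SSC concentration lowers the estimated Tm, so a fixed wash temperature becomes more stringent as the salt concentration falls.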

High stringency conditions for hybridization between polynucleotides of the present invention 
include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour. 
Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC concentration may 
be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking 
reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, 
sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic solvent, such as 
formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances, 
such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily 
apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency 
conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is 
strongly indicative of a similar role for the nucleotides and their encoded polypeptides. 

The term "hybridization complex" refers to a complex formed between two nucleic acids by 
virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex 
may be formed in solution (e.g., C0t or R0t analysis) or formed between one nucleic acid present in 
solution and another nucleic acid immobilized on a solid support (e.g., paper, membranes, filters, chips, 
pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have been 
fixed). 

The words "insertion" and "addition" refer to changes in an amino acid or polynucleotide 
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively. 

"Immune response" can refer to conditions associated with inflammation, trauma, immune 
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression 
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect 
cellular and systemic defense systems. 

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of KPP which is 
capable of eliciting an immune response when introduced into a living organism, for example, a 
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment 
of KPP which is useful in any of the antibody production methods disclosed herein or known in the art. 

The term "microarray" refers to an arrangement of a plurality of polynucleotides, 
polypeptides, antibodies, or other chemical compounds on a substrate. 

The terms "element" and "array element" refer to a polynucleotide, polypeptide, antibody, or 
other chemical compound having a unique and defined position on a microarray. 

The term "modulate" refers to a change in the activity of KPP. For example, modulation may 
cause an increase or a decrease in protein activity, binding characteristics, or any other biological, 
functional, or immunological properties of KPP. 

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide, 
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material. 

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a 
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably 
linked to a coding sequence if the promoter affects the transcription or expression of the coding 
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where 
necessary to join two protein coding regions, in the same reading frame. 

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which 
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of 
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs 
preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and 
may be pegylated to extend their lifespan in the cell. 

"Post-translational modification" of a KPP may involve lipidation, glycosylation, 
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the 
art. These processes may occur synthetically or biochemically. Biochemical modifications will vary 
by cell type depending on the enzymatic milieu of KPP. 

"Probe" refers to nucleic acids encoding KPP, their complements, or fragments thereof, 
which are used to detect identical, allelic or related nucleic acids. Probes are isolated oligonucleotides 
or polynucleotides attached to a detectable label or reporter molecule. Typical labels include 
radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are short nucleic 
acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by 
complementary base-pairing. The primer may then be extended along the target DNA strand by a 
DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic 
acid, e.g., by the polymerase chain reaction (PCR). 

Probes and primers as used in the present invention typically comprise at least 15 contiguous 
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also 
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers 
may be considerably longer than these examples, and it is understood that any length supported by the 
specification, including the tables, figures, and Sequence Listing, may be used. 

Methods for preparing and using probes and primers are described in, for example, Sambrook, 
J. and D.W. Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3, Cold Spring 
Harbor Press, Cold Spring Harbor NY), Ausubel, F.M. et al. (1999; Short Protocols in Molecular 
Biology, 4th ed., John Wiley & Sons, New York NY), and Innis, M. et al. (1990; PCR Protocols, A 
Guide to Methods and Applications, Academic Press, San Diego CA). PCR primer pairs can be 
derived from a known sequence, for example, by using computer programs intended for that purpose 
such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA). 

Oligonucleotides for use as primers are selected using software known in the art for such 
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 
nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection 
programs have incorporated additional features for expanded capabilities. For example, the PrimOU 
primer selection program (available to the public from the Genome Center at University of Texas 
South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase 
sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer 
selection program (available to the public from the Whitehead Institute/MIT Center for Genome 
Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to 
avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of 
oligonucleotides for microarrays. (The source code for the latter two primer selection programs may 
also be obtained from their respective sources and modified to meet the user's specific needs.) The 
PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource 
Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing 
selection of primers that hybridize to either the most conserved or least conserved regions of aligned 
nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved 
oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments 
identified by any of the above selection methods are useful in hybridization technologies, for example, 
as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially 
complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are 
not limited to those described above. 

A "recombinant nucleic acid" is a nucleic acid that is not naturally occurring or has a 
sequence that is made by an artificial combination of two or more otherwise separated segments of 
sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, 
by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering 
techniques such as those described in Sambrook and Russell (supra). The term recombinant includes 
nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the 
nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably 
linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, 
for example, to transform a cell. 

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a 
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is 
expressed, inducing a protective immunological response in the mammal. 

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated 
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions 
(UTRs). Regulatory elements interact with host or viral proteins which control transcription, 
translation, or RNA stability. 

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid, 
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, 
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and 
other moieties known in the art. 

An "RNA equivalent," in reference to a DNA molecule, is composed of the same linear 
sequence of nucleotides as the reference DNA molecule with the exception that all occurrences of 
the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose 
instead of deoxyribose. 
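
At the level of the written sequence, the RNA equivalent defined above amounts to replacing every thymine with uracil while leaving the order of bases unchanged. A minimal sketch follows; the function name is hypothetical, and the change of sugar backbone has no counterpart in a sequence string.

```python
# Translation table mapping thymine to uracil in either letter case.
_T_TO_U = str.maketrans("Tt", "Uu")


def rna_equivalent(dna_sequence: str) -> str:
    """Return the RNA equivalent of a DNA sequence string (T replaced by U)."""
    return dna_sequence.translate(_T_TO_U)


print(rna_equivalent("ATGCTTA"))
```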

The term "sample" is used in its broadest sense. A sample suspected of containing KPP, 
nucleic acids encoding KPP, or fragments thereof may comprise a bodily fluid; an extract from a cell, 
chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in 
solution or bound to a substrate; a tissue; a tissue print; etc. 

The terms "specific binding" and "specifically binding" refer to that interaction between a 
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or 
synthetic binding composition. The interaction is dependent upon the presence of a particular structure 
of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For 
example, if an antibody is specific for epitope "A," the presence of a polypeptide comprising the 
epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the 
antibody will reduce the amount of labeled A that binds to the antibody. 

The term "substantially purified" refers to nucleic acid or amino acid sequences that are 
removed from their natural environment and are isolated or separated, and are at least about 60% 
free, preferably at least about 75% free, and most preferably at least about 90% free from other 
components with which they are naturally associated. 

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides 
by different amino acid residues or nucleotides, respectively. 

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters, 
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, 
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, 
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound. 

A "transcript image" or "expression profile" refers to the collective pattern of gene expression 
by a particular cell type or tissue under given conditions at a given time. 

"Transformation" describes a process by which exogenous DNA is introduced into a recipient 
cell. Transformation may occur under natural or artificial conditions according to various methods 
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid 
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based 
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or viral 
infection, electroporation, heat shock, lipofection, and particle bombardment. The term "transformed 
cells" includes stably transformed cells in which the inserted DNA is capable of replication either as 
an autonomously replicating plasmid or as part of the host chromosome, as well as transiently 
transformed cells which express the inserted DNA or RNA for limited periods of time. 

A "transgenic organism," as used herein, is any organism, including but not limited to animals 
and plants, in which one or more of the cells of the organism contains heterologous nucleic acid 
introduced by way of human intervention, such as by transgenic techniques well known in the art. The 
nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, 
by way of deliberate genetic manipulation, such as by microinjection or by infection with a 
recombinant virus. In another embodiment, the nucleic acid can be introduced by infection with a 
recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science 295:868-872). The 
term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather 
is directed to the introduction of a recombinant DNA molecule. The transgenic organisms 
contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, plants 
and animals. The isolated DNA of the present invention can be introduced into the host by methods 
known in the art, for example infection, transfection, transformation or transconjugation. Techniques 
for transferring the DNA of the present invention into such organisms are widely known and provided 
in references such as Sambrook and Russell (supra). 

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having 
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of 
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07- 
1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at 
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater 
sequence identity over a certain defined length. A variant may be described as, for example, an 
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have 
significant identity to a reference molecule, but will generally have a greater or lesser number of 
polynucleotides due to alternate splicing during mRNA processing. The corresponding polypeptide 
may possess additional functional domains or lack domains that are present in the reference molecule. 
Species variants are polynucleotides that vary from one species to another. The resulting polypeptides 
will generally have significant amino acid identity relative to each other. A polymorphic variant is a 
variation in the polynucleotide sequence of a particular gene between individuals of a given species. 
Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in which the 
polynucleotide sequence varies by one nucleotide base. The presence of SNPs may be indicative of, 
for example, a certain population, a disease state, or a propensity for a disease state. 

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having 
at least 40% sequence identity or sequence similarity to the particular polypeptide sequence over a 
certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool 
Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of polypeptides may show, for 
example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 
91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, 
or at least 99% or greater sequence identity or sequence similarity over a certain defined length of one 
of the polypeptides. 

THE INVENTION 

Various embodiments of the invention include new human kinases and phosphatases (KPP), 
the polynucleotides encoding KPP, and the use of these compositions for the diagnosis, treatment, or 
prevention of cardiovascular diseases, immune system disorders, neurological disorders, disorders 
affecting growth and development, lipid disorders, cell proliferative disorders, and cancers. 

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide 
embodiments of the invention. Each polynucleotide and its corresponding polypeptide are correlated to 
a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is 
denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an 
Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide 
sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID 
NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown. 

Table 2 shows sequences with homology to polypeptide embodiments of the invention as 
identified by BLAST analysis against the GenBank protein (genpept) database and the PROTEOME 
database. Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ 
ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for 
polypeptides of the invention. Column 3 shows the GenBank identification number (GenBank ID NO:) 
of the nearest GenBank homolog and the PROTEOME database identification numbers 
(PROTEOME ID NO:) of the nearest PROTEOME database homologs. Column 4 shows the 
probability scores for the matches between each polypeptide and its homolog(s). Column 5 shows the 
annotation of the GenBank and PROTEOME database homolog(s) along with relevant citations where 
applicable, all of which are expressly incorporated by reference herein. 

Table 3 shows various structural features of the polypeptides of the invention. Columns 1 and 
2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding Incyte 
polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column 
3 shows the number of amino acid residues in each polypeptide. Column 4 shows amino acid residues 
comprising signature sequences, domains, motifs, potential phosphorylation sites, and potential 
glycosylation sites. Column 5 shows analytical methods for protein structure/function analysis and in 
some cases, searchable databases to which the analytical methods were applied. 

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these 
properties establish that the claimed polypeptides are kinases and phosphatases. For example, SEQ 
ID NO:6 is 93% identical, from residue E39 to residue I490, to human multifunctional 
calcium/calmodulin-dependent protein kinase II delta2 isoform (GenBank ID g4426595) as determined 
by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 
9.0E-255, which indicates the probability of obtaining the observed polypeptide sequence alignment by 
chance. SEQ ID NO:6 also has homology to calcium-calmodulin dependent protein kinase II delta, a 
member of the multifunctional CAMKII family involved in Ca2+ regulated processes, of which the 
alternative form delta 3 is specifically upregulated in the myocardium of patients with heart failure, as 
determined by BLAST analysis using the PROTEOME database. SEQ ID NO:6 also contains a 
protein kinase domain and a serine/threonine protein kinase catalytic domain as determined by 
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM and 
SMART databases of conserved protein families/domains. (See Table 3.) Data from BLIMPS, 
MOTIFS, and PROFILESCAN analyses, and BLAST analyses against the PRODOM and DOMO 
databases, provide further corroborative evidence that SEQ ID NO:6 is a calcium-calmodulin 
dependent protein kinase. The foregoing provides evidence that SEQ ID NO:6 is a calcium- 
calmodulin dependent protein kinase. SEQ ID NO:1-5 and SEQ ID NO:7-15 were analyzed and 
annotated in a similar manner. The algorithms and parameters for the analysis of SEQ ID NO:1-15 
are described in Table 7. 

As shown in Table 4, full length polynucleotide embodiments were assembled using cDNA 
sequences or coding (exon) sequences derived from genomic DNA, or any combination of these two 
types of sequences. Column 1 lists the polynucleotide sequence identification number (Polynucleotide 
SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number (Incyte ID) for 
each polynucleotide of the invention, and the length of each polynucleotide sequence in basepairs. 
Column 2 lists fragments of the polynucleotides which are useful, for example, in hybridization or 
amplification technologies that identify SEQ ID NO:16-30 or that distinguish between SEQ ID 
NO:16-30 and related polynucleotides. Column 3 shows identification numbers corresponding to 
cDNA sequences, coding sequences (exons) predicted from genomic DNA, and/or sequence 
assemblages comprised of both cDNA and genomic DNA. These sequences were used to assemble 
the full length polynucleotide embodiments. Columns 4 and 5 of Table 4 show the nucleotide start (5') 
and stop (3') positions of the cDNA and/or genomic sequences in column 3 relative to their respective 
full length sequences. 

The identification numbers in Column 3 of Table 4 may refer specifically, for example, to 
Incyte cDNAs along with their corresponding cDNA libraries. For example, 2944771F7 is the 
identification number of an Incyte cDNA sequence, and BRAITUT23 is the cDNA library from 
which it is derived. Incyte cDNAs for which cDNA libraries are not indicated were derived from 
pooled cDNA libraries (e.g., 72678960V1). Alternatively, the identification numbers in column 3 may 
refer to GenBank cDNAs or ESTs (e.g., g3422499) which contributed to the assembly of the full 
length polynucleotides. In addition, the identification numbers in column 3 may identify sequences 
derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences 
including the designation "ENST"). Alternatively, the identification numbers in column 3 may be 
derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences 
including the designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those 
sequences including the designation "NP"). Alternatively, the identification numbers in column 3 may 
refer to assemblages of both cDNA and Genscan-predicted exons brought together by an "exon 
stitching" algorithm. For example, FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched" 
sequence in which XXXXXX is the identification number of the cluster of sequences to which the 
algorithm was applied, and YYYYY is the number of the prediction generated by the algorithm, and 
N1,2,3..., if present, represent specific exons that may have been manually edited during analysis (See 
Example V). Alternatively, the identification numbers in column 3 may refer to assemblages of exons 
brought together by an "exon-stretching" algorithm. For example, 
FLXXXXXX_gAAAAA_gBBBBB_1_N is the identification number of a "stretched" sequence, with 
XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification 
number of the human genomic sequence to which the "exon-stretching" algorithm was applied, 
gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the 
nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances 
where a RefSeq sequence was used as a protein homolog for the "exon-stretching" algorithm, a 
RefSeq identifier (denoted by "NM," "NP," or "NT") may be used in place of the GenBank identifier 
(i.e., gBBBBB). 

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from 
genomic DNA sequences, or derived from a combination of sequence analysis methods. The 
following Table lists examples of component sequence prefixes and corresponding sequence analysis 
methods associated with the prefixes (see Example IV and Example V). 



Prefix            Type of analysis and/or examples of programs 

GNN, GFG, ENST    Exon prediction from genomic sequences using, for example, 
                  GENSCAN (Stanford University, CA, USA) or FGENES 
                  (Computer Genomics Group, The Sanger Centre, Cambridge, UK). 

GBI               Hand-edited analysis of genomic sequences. 

FL                Stitched or stretched genomic sequences (see Example V). 

INCY              Full length transcript and exon prediction from mapping of EST 
                  sequences to the genome. Genomic location and EST composition 
                  data are combined to predict the exons and resulting transcript. 

In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in 
Table 4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte 
cDNA identification numbers are not shown. 

Table 5 shows the representative cDNA libraries for those full length polynucleotides which
were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte
cDNA library which is most frequently represented by the Incyte cDNA sequences which were used
to assemble and confirm the above polynucleotides. The tissues and vectors which were used to
construct the cDNA libraries shown in Table 5 are described in Table 6.

Table 8 shows single nucleotide polymorphisms (SNPs) found in polynucleotide sequences of
the invention, along with allele frequencies in different human populations. Columns 1 and 2 show the
polynucleotide sequence identification number (SEQ ID NO:) and the corresponding Incyte project
identification number (PID) for polynucleotides of the invention. Column 3 shows the Incyte
identification number for the EST in which the SNP was detected (EST ID), and column 4 shows the
identification number for the SNP (SNP ID). Column 5 shows the position within the EST sequence
at which the SNP is located (EST SNP), and column 6 shows the position of the SNP within the
full-length polynucleotide sequence (CB1 SNP). Column 7 shows the allele found in the EST sequence.
Columns 8 and 9 show the two alleles found at the SNP site. Column 10 shows the amino acid
encoded by the codon including the SNP site, based upon the allele found in the EST. Columns 11-14
show the frequency of allele 1 in four different human populations. An entry of n/d (not detected)
indicates that the frequency of allele 1 in the population was too low to be detected, while n/a (not
available) indicates that the allele frequency was not determined for the population.

The invention also encompasses KPP variants. Various embodiments of KPP variants can
have at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about
90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% amino acid
sequence identity to the KPP amino acid sequence, and can contain at least one functional or
structural characteristic of KPP.

Various embodiments also encompass polynucleotides which encode KPP. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:16-30, which encodes KPP. The polynucleotide sequences
of SEQ ID NO:16-30, as presented in the Sequence Listing, embrace the equivalent RNA sequences,
wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar
backbone is composed of ribose instead of deoxyribose.
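
The base substitution described above can be illustrated with a minimal Python sketch. The function name `to_rna` and the restriction to the letters A, C, G, T and N are assumptions made for illustration; they are not part of the Sequence Listing conventions.

```python
def to_rna(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence by replacing thymine with uracil.

    Only the base letters change; the ribose/deoxyribose difference is a
    chemical property that a plain-text sequence does not record.
    """
    allowed = set("ACGTNacgtn")  # assumed alphabet for this sketch
    unexpected = set(dna) - allowed
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U").replace("t", "u")


print(to_rna("ATGTTCGGA"))
```

For the input `ATGTTCGGA` the sketch prints `AUGUUCGGA`.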

The invention also encompasses variants of a polynucleotide encoding KPP. In particular,
such a variant polynucleotide will have at least about 70%, at least about 75%, at least about 80%, at
least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at
least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at
least about 99% polynucleotide sequence identity to a polynucleotide encoding KPP. A particular
aspect of the invention encompasses a variant of a polynucleotide comprising a sequence selected
from the group consisting of SEQ ID NO:16-30 which has at least about 70%, at least about 75%, at
least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at
least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, or at least about 99% polynucleotide sequence identity to a nucleic acid sequence
selected from the group consisting of SEQ ID NO:16-30. Any one of the polynucleotide variants
described above can encode a polypeptide which contains at least one functional or structural
characteristic of KPP.
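
As a rough illustration of how a sequence identity figure can be compared against the recited thresholds, the following Python sketch computes percent identity over two sequences that have already been aligned. It is a simplified sketch under stated assumptions: the function names, the use of `-` as the gap character, and the choice to count gapped columns in the denominator are illustrative, and the alignment method and identity calculation actually governing the specification may differ.

```python
from typing import Optional

# Identity thresholds recited in the text, in percent.
THRESHOLDS = (70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the largest recited threshold that the identity reaches, if any."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, highest_threshold_met(identity))
```

The phrase "at least about" is not modeled; the sketch uses a strict numerical comparison.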

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant of a
polynucleotide encoding KPP. A splice variant may have portions which have significant sequence
identity to a polynucleotide encoding KPP, but will generally have a greater or lesser number of
nucleotides due to additions or deletions of blocks of sequence arising from alternate splicing during
mRNA processing. A splice variant may have less than about 70%, or alternatively less than about
60%, or alternatively less than about 50% polynucleotide sequence identity to a polynucleotide
encoding KPP over its entire length; however, portions of the splice variant will have at least about
70%, or alternatively at least about 85%, or alternatively at least about 95%, or alternatively 100%
polynucleotide sequence identity to portions of the polynucleotide encoding KPP. For example, a
polynucleotide comprising a sequence of SEQ ID NO:19 and a polynucleotide comprising a sequence
of SEQ ID NO:20 are splice variants of each other; a polynucleotide comprising a sequence of SEQ
ID NO:21 and a polynucleotide comprising a sequence of SEQ ID NO:22 are splice variants of each
other; and a polynucleotide comprising a sequence of SEQ ID NO:23 and a polynucleotide comprising
a sequence of SEQ ID NO:24 are splice variants of each other. Any one of the splice variants
described above can encode a polypeptide which contains at least one functional or structural
characteristic of KPP.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding KPP, some bearing minimal similarity
to the polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus,
the invention contemplates each and every possible variation of polynucleotide sequence that could be
made by selecting combinations based on possible codon choices. These combinations are made in
accordance with the standard triplet genetic code as applied to the polynucleotide sequence of
naturally occurring KPP, and all such variations are to be considered as being specifically disclosed.
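
The scale of this degeneracy can be illustrated with a short Python sketch that builds the standard triplet genetic code and counts, or enumerates, the coding sequences possible for a peptide. The names `count_encodings` and `enumerate_encodings` and the example peptides are hypothetical and chosen only for illustration; stop codons are not appended.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal '*') they encode.
SYNONYMOUS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMOUS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(len(SYNONYMOUS[aa]) for aa in peptide.upper())


def enumerate_encodings(peptide: str):
    """Lazily yield every DNA sequence that encodes the given peptide."""
    choices = (SYNONYMOUS[aa] for aa in peptide.upper())
    return ("".join(codons) for codons in product(*choices))


print(count_encodings("MLS"))            # 1 * 6 * 6 combinations
print(sorted(enumerate_encodings("MK")))
```

Because the count is a product over residues, it grows exponentially with peptide length, so enumeration is practical only for short fragments; counting remains cheap.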

Although polynucleotides which encode KPP and its variants are generally capable of
hybridizing to polynucleotides encoding naturally occurring KPP under appropriately selected
conditions of stringency, it may be advantageous to produce polynucleotides encoding KPP or its
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring
codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a
particular prokaryotic or eukaryotic host in accordance with the frequency with which particular
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence
encoding KPP and its derivatives without altering the encoded amino acid sequences include the
production of RNA transcripts having more desirable properties, such as a greater half-life, than
transcripts produced from the naturally occurring sequence.
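
Selecting codons according to host usage can be sketched in Python as follows. This is a minimal illustration, not a description of any particular optimization method: the function `back_translate` is hypothetical, and the frequencies in `TOY_USAGE` are invented placeholder numbers, not measured codon usage for any organism.

```python
from typing import Mapping


def back_translate(peptide: str, usage: Mapping[str, Mapping[str, float]]) -> str:
    """Encode a peptide using, for each residue, the host's most frequent codon.

    `usage` maps a one-letter amino acid code to a mapping of codon -> relative
    frequency in the chosen host.
    """
    codons = []
    for aa in peptide.upper():
        choices = usage[aa]  # raises KeyError if the table lacks this residue
        codons.append(max(choices, key=choices.get))
    return "".join(codons)


# Placeholder frequencies for illustration only (assumed, not real data).
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "F": {"TTT": 0.4, "TTC": 0.6},
}

print(back_translate("MKF", TOY_USAGE))
```

Always choosing the single most frequent codon is the simplest possible policy; the encoded amino acid sequence is unchanged regardless of which synonymous codon is picked.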

The invention also encompasses production of polynucleotides which encode KPP and KPP
derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic
polynucleotide may be inserted into any of the many available expression vectors and cell systems
using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce
mutations into a polynucleotide encoding KPP or any fragment thereof.

Embodiments of the invention can also include polynucleotides that are capable of hybridizing
to the claimed polynucleotides, and, in particular, to those having the sequences shown in SEQ ID
NO:16-30 and fragments thereof, under various conditions of stringency (Wahl, G.M. and S.L. Berger
(1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511).
Hybridization conditions, including annealing and wash conditions, are described in "Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Biosciences, Piscataway NJ), or combinations
of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification
system (Invitrogen, Carlsbad CA). Preferably, sequence preparation is automated with machines
such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV), PTC200 thermal cycler
(MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler (Applied Biosystems).
Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied
Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Biosciences), or other
systems known in the art. The resulting sequences are analyzed using a variety of algorithms which
are well known in the art (Ausubel et al., supra, ch. 7; Meyers, R.A. (1995) Molecular Biology and
Biotechnology, Wiley VCH, New York NY, pp. 856-853).

The nucleic acids encoding KPP may be extended utilizing a partial nucleotide sequence and
employing various PCR-based methods known in the art to detect upstream sequences, such as
promoters and regulatory elements. For example, one method which may be employed, restriction-site
PCR, uses universal and nested primers to amplify unknown sequence from genomic DNA within a
cloning vector (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). Another method, inverse PCR,
uses primers that extend in divergent directions to amplify unknown sequence from a circularized
template. The template is derived from restriction fragments comprising a known genomic locus and
surrounding sequences (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). A third method, capture
PCR, involves PCR amplification of DNA fragments adjacent to known sequences in human and
yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR Methods Applic. 1:111-119). In
this method, multiple restriction enzyme digestions and ligations may be used to insert an engineered
double-stranded sequence into a region of unknown sequence before performing PCR. Other
methods which may be used to retrieve unknown sequences are known in the art (Parker, J.D. et al.
(1991) Nucleic Acids Res. 19:3055-3060). Additionally, one may use PCR, nested primers, and
PROMOTERFINDER libraries (BD Clontech, Palo Alto CA) to walk genomic DNA. This
procedure avoids the need to screen libraries and is useful in finding intron/exon junctions. For all
PCR-based methods, primers may be designed using commercially available software, such as
OLIGO 4.06 primer analysis software (National Biosciences, Plymouth MN) or another appropriate
program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and
to anneal to the template at temperatures of about 68°C to 72°C.
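
The length and GC-content guidelines for primers can be checked with a short Python sketch. It is an illustrative screen only and does not reproduce the behavior of OLIGO 4.06 or any other primer design program; the function names are hypothetical, the word "about" is treated as a strict bound, and the annealing temperature criterion is not modeled.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") + s.count("C")) / len(s)


def meets_primer_guidelines(
    primer: str,
    min_len: int = 22,
    max_len: int = 30,
    min_gc: float = 0.50,
) -> bool:
    """True if the primer is 22 to 30 nt long and has a GC content of 50% or more."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


candidates = [
    "GCGCATATGCGCATATGCGCATAT",  # 24 nt, 50% GC
    "ATATATATATATATATATATATAT",  # 24 nt, 0% GC
    "GCGC",                      # too short
]
for primer in candidates:
    print(primer, meets_primer_guidelines(primer))
```

Only the first candidate passes both checks; the second fails on GC content and the third on length.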

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence
into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different
nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for
detection of the emitted wavelengths. Output/light intensity may be converted to electrical signal
using appropriate software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and
the entire process from loading of samples to computer analysis and electronic data display may be
computer controlled. Capillary electrophoresis is especially preferable for sequencing small DNA
fragments which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotides or fragments thereof which encode
KPP may be cloned in recombinant DNA molecules that direct expression of KPP, or fragments or
functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the genetic
code, other polynucleotides which encode substantially the same or a functionally equivalent
polypeptide may be produced and used to express KPP.

The polynucleotides of the invention can be engineered using methods generally known in the
art in order to alter KPP-encoding sequences for a variety of purposes including, but not limited to,
modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by
random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be
used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed
mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation
patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve
the biological properties of KPP, such as its biological or enzymatic activity or its ability to bind to
other molecules or compounds. DNA shuffling is a process by which a library of gene variants is
produced using PCR-mediated recombination of gene fragments. The library is then subjected to
selection or screening procedures that identify those gene variants with the desired properties. These
preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and
selection/screening. Thus, genetic diversity is created through "artificial" breeding and rapid
molecular evolution. For example, fragments of a single gene containing random point mutations may
be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively,
fragments of a given gene may be recombined with fragments of homologous genes in the same gene
family, either from the same or different species, thereby maximizing the genetic diversity of multiple
naturally occurring genes in a directed and controllable manner.

In another embodiment, polynucleotides encoding KPP may be synthesized, in whole or in
part, using one or more chemical methods well known in the art (Caruthers, M.H. et al. (1980)
Nucleic Acids Symp. Ser. 7:215-223; Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232).
Alternatively, KPP itself or a fragment thereof may be synthesized using chemical methods known in
the art. For example, peptide synthesis can be performed using various solution-phase or solid-phase
techniques (Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New
York NY, pp. 55-60; Roberge, J.Y. et al. (1995) Science 269:202-204). Automated synthesis may be
achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid
sequence of KPP, or any part thereof, may be altered during direct synthesis and/or combined with
sequences from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide
having a sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid
chromatography (Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421). The
composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing
(Creighton, supra, pp. 28-53).

In order to express a biologically active KPP, the polynucleotides encoding KPP or derivatives
thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the
necessary elements for transcriptional and translational control of the inserted coding sequence in a
suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotides encoding
KPP. Such elements may vary in their strength and specificity. Specific initiation signals may also be
used to achieve more efficient translation of polynucleotides encoding KPP. Such signals include the
ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where a
polynucleotide sequence encoding KPP and its initiation codon and upstream regulatory sequences are
inserted into the appropriate expression vector, no additional transcriptional or translational control
signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is
inserted, exogenous translational control signals including an in-frame ATG initiation codon should be
provided by the vector. Exogenous translational elements and initiation codons may be of various
origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of
enhancers appropriate for the particular host cell system used (Scharf, D. et al. (1994) Results Probl.
Cell Differ. 20:125-162).

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing polynucleotides encoding KPP and appropriate transcriptional and translational
control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination (Sambrook and Russell, supra, ch. 1-4, and 8; Ausubel et al.,
supra, ch. 1, 3, and 15).

A variety of expression vector/host systems may be utilized to contain and express
polynucleotides encoding KPP. These include, but are not limited to, microorganisms such as bacteria
transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast
transformed with yeast expression vectors; insect cell systems infected with viral expression vectors
(e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower
mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or
pBR322 plasmids); or animal cell systems (Sambrook and Russell, supra; Ausubel et al., supra; Van
Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994)
Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945;
Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and Technology
(1992) McGraw Hill, New York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad.
Sci. USA 81:3655-3659; Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355). Expression vectors
derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial
plasmids, may be used for delivery of polynucleotides to the targeted organ, tissue, or cell population
(Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5:350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci.
USA 90:6340-6344; Buller, R.M. et al. (1985) Nature 317:813-815; McGregor, D.P. et al. (1994) Mol.
Immunol. 31:219-226; Verma, I.M. and N. Somia (1997) Nature 389:239-242). The invention is not
limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotides encoding KPP. For example, routine cloning, subcloning,
and propagation of polynucleotides encoding KPP can be achieved using a multifunctional E. coli
vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1 plasmid (Invitrogen).
Ligation of polynucleotides encoding KPP into the vector's multiple cloning site disrupts the lacZ gene,
allowing a colorimetric screening procedure for identification of transformed bacteria containing
recombinant molecules. In addition, these vectors may be useful for in vitro transcription, dideoxy
sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned
sequence (Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509). When large
quantities of KPP are needed, e.g., for the production of antibodies, vectors which direct high level
expression of KPP may be used. For example, vectors containing the strong, inducible SP6 or T7
bacteriophage promoter may be used.

Yeast expression systems may be used for production of KPP. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable
integration of foreign polynucleotide sequences into the host genome for stable propagation (Ausubel
et al., supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184).

Plant systems may also be used for expression of KPP. Transcription of polynucleotides
encoding KPP may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984)
Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These constructs
can be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection
(The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp.
191-196).

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, polynucleotides encoding KPP may be ligated
into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses KPP in host cells (Logan, J. and T. Shenk (1984) Proc. Natl. Acad.
Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV)
enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based vectors
may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers,
or vesicles) for therapeutic purposes (Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355).

For long term production of recombinant proteins in mammalian systems, stable expression of
KPP in cell lines is preferred. For example, polynucleotides encoding KPP can be transformed into
cell lines using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector. Following the
introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before
being switched to selective media. The purpose of the selectable marker is to confer resistance to a
selective agent, and its presence allows growth and recovery of cells which successfully express the
introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue
culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk⁻ and apr⁻ cells, respectively (Wigler, M. et al. (1977)
Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823). Also, antimetabolite, antibiotic, or herbicide
resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively (Wigler, M. et al.
(1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol.
150:1-14). Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular
requirements for metabolites (Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA
85:8047-8051). Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; BD Clontech),
β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131).

Although the presence/absence of marker gene expression suggests that the gene of interest
is also present, the presence and expression of the gene may need to be confirmed. For example, if
the sequence encoding KPP is inserted within a marker gene sequence, transformed cells containing
polynucleotides encoding KPP can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with a sequence encoding KPP under the
control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the tandem gene as well.

In general, host cells that contain the polynucleotide encoding KPP and that express KPP may
be identified by a variety of procedures known to those of skill in the art. These procedures include,
but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein
bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for
the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of KPP using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on KPP is preferred, but a competitive
binding assay may be employed. These and other assays are well known in the art (Hampton, R. et
al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN, Sect. IV; Coligan,
J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience,
New York NY; Pound, J.D. (1998) Immunochemical Protocols, Humana Press, Totowa NJ).

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization
or PCR probes for detecting sequences related to polynucleotides encoding KPP include oligolabeling,
nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively,
polynucleotides encoding KPP, or any fragments thereof, may be cloned into a vector for the
production of an mRNA probe. Such vectors are known in the art, are commercially available, and
may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such
as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of
commercially available kits, such as those provided by Amersham Biosciences, Promega (Madison
WI), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of
detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as
well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with polynucleotides encoding KPP may be cultured under conditions
suitable for the expression and recovery of the protein from cell culture. The protein produced by a
transformed cell may be secreted or retained intracellularly depending on the sequence and/or the
vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides which encode KPP may be designed to contain signal sequences which direct
secretion of KPP through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted polynucleotides or to process the expressed protein in the desired fashion. Such modifications
of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation,
phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a "prepro" or
"pro" form of the protein may also be used to specify protein targeting, folding, and/or activity.
Different host cells which have specific cellular machinery and characteristic mechanisms for
post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC, Manassas VA) and may be chosen to ensure the correct
modification and processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant polynucleotides
encoding KPP may be ligated to a heterologous sequence resulting in translation of a fusion protein in
any of the aforementioned host systems. For example, a chimeric KPP protein containing a
heterologous moiety that can be recognized by a commercially available antibody may facilitate the
screening of peptide libraries for inhibitors of KPP activity. Heterologous protein and peptide moieties
may also facilitate purification of fusion proteins using commercially available affinity matrices. Such
moieties include, but are not limited to, glutathione S-transferase (GST), maltose binding protein
(MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and hemagglutinin
(HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion proteins on
immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins,
respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion
proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize
these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site
located between the KPP encoding sequence and the heterologous protein sequence, so that KPP
may be cleaved away from the heterologous moiety following purification. Methods for fusion protein
expression and purification are discussed in Ausubel et al. (supra, ch. 10 and 16). A variety of
commercially available kits may also be used to facilitate expression and purification of fusion
proteins.

In another embodiment, synthesis of radiolabeled KPP may be achieved in vitro using the
TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple
transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6
promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for
example, 35S-methionine.

KPP, fragments of KPP, or variants of KPP may be used to screen for compounds that
specifically bind to KPP. One or more test compounds may be screened for specific binding to KPP.
In various embodiments, 1, 2, 3, 4, 5, 10, 20, 50, 100, or 200 test compounds can be screened for
specific binding to KPP. Examples of test compounds can include antibodies, anticalins,
oligonucleotides, proteins (e.g., ligands or receptors), or small molecules.

In related embodiments, variants of KPP can be used to screen for binding of test compounds,
such as antibodies, to KPP, a variant of KPP, or a combination of KPP and/or one or more variants
of KPP. In an embodiment, a variant of KPP can be used to screen for compounds that bind to a variant
of KPP, but not to KPP having the exact sequence of a sequence of SEQ ID NO: 1-15. KPP variants
used to perform such screening can have a range of about 50% to about 99% sequence identity to
KPP, with various embodiments having 60%, 70%, 75%, 80%, 85%, 90%, and 95% sequence identity.
In an embodiment, a compound identified in a screen for specific binding to KPP can be
closely related to the natural ligand of KPP, e.g., a ligand or fragment thereof, a natural substrate, a
structural or functional mimetic, or a natural binding partner (Coligan, J.E. et al. (1991) Current
Protocols in Immunology 1(2):Chapter 5). In another embodiment, the compound thus identified can
be a natural ligand of a receptor KPP (Howard, A.D. et al. (2001) Trends Pharmacol. Sci. 22:132-140;
Wise, A. et al. (2002) Drug Discovery Today 7:235-246).
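
By way of illustration only, the selection of KPP variants by sequence identity may be sketched as follows. The sketch assumes two sequences that have already been aligned (with "-" marking gaps) and uses a simplified column-wise definition of percent identity; the function names, the example sequences, and the gap handling are hypothetical and do not come from the specification, and any definition of percent identity given elsewhere in the specification governs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns (simplified, illustrative definition).

    Columns in which both sequences carry a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def in_variant_range(identity: float, low: float = 50.0, high: float = 99.0) -> bool:
    """True if the identity falls in the about-50%-to-about-99% range named in the text."""
    return low <= identity <= high


if __name__ == "__main__":
    # Hypothetical 10-residue sequences differing at one position.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQR"
    identity = percent_identity(reference, variant)
    print(f"{identity:.1f}% identity; usable as a variant: {in_variant_range(identity)}")
```

In practice an alignment program would supply the aligned strings; the sketch only shows how a candidate variant could be checked against the stated identity range.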

In other embodiments, a compound identified in a screen for specific binding to KPP can be
closely related to the natural receptor to which KPP binds, at least a fragment of the receptor, or a
fragment of the receptor including all or a portion of the ligand binding site or binding pocket. For
example, the compound may be a receptor for KPP which is capable of propagating a signal, or a
decoy receptor for KPP which is not capable of propagating a signal (Ashkenazi, A. and V.M. Dixit
(1999) Curr. Opin. Cell Biol. 11:255-260; Mantovani, A. et al. (2001) Trends Immunol. 22:328-336).
The compound can be rationally designed using known techniques. Examples of such techniques
include those used to construct the compound etanercept (ENBREL; Amgen Inc., Thousand Oaks
CA), which is efficacious for treating rheumatoid arthritis in humans. Etanercept is an engineered p75
tumor necrosis factor (TNF) receptor dimer linked to the Fc portion of human IgG1 (Taylor, P.C. et al.
(2001) Curr. Opin. Immunol. 13:611-616).

In one embodiment, two or more antibodies having similar or, alternatively, different
specificities can be screened for specific binding to KPP, fragments of KPP, or variants of KPP. The
binding specificity of the antibodies thus screened can thereby be selected to identify particular
fragments or variants of KPP. In one embodiment, an antibody can be selected such that its binding
specificity allows for preferential identification of specific fragments or variants of KPP. In another
embodiment, an antibody can be selected such that its binding specificity allows for preferential
diagnosis of a specific disease or condition having increased, decreased, or otherwise abnormal
production of KPP.

In an embodiment, anticalins can be screened for specific binding to KPP, fragments of KPP,
or variants of KPP. Anticalins are ligand-binding proteins that have been constructed based on a
lipocalin scaffold (Weiss, G.A. and H.B. Lowman (2000) Chem. Biol. 7:R177-R184; Skerra, A.
(2001) J. Biotechnol. 74:257-275). The protein architecture of lipocalins can include a beta-barrel
having eight antiparallel beta-strands, which supports four loops at its open end. These loops form the
natural ligand-binding site of the lipocalins, a site which can be re-engineered in vitro by amino acid
substitutions to impart novel binding specificities. The amino acid substitutions can be made using
methods known in the art or described herein, and can include conservative substitutions (e.g.,
substitutions that do not alter binding specificity) or substitutions that modestly, moderately, or
significantly alter binding specificity.

In one embodiment, screening for compounds which specifically bind to, stimulate, or inhibit
KPP involves producing appropriate cells which express KPP, either as a secreted protein or on the
cell membrane. Preferred cells can include cells from mammals, yeast, Drosophila, or E. coli. Cells
expressing KPP or cell membrane fractions which contain KPP are then contacted with a test
compound and binding, stimulation, or inhibition of activity of either KPP or the compound is analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the
assay may comprise the steps of combining at least one test compound with KPP, either in solution or
affixed to a solid support, and detecting the binding of KPP to the compound. Alternatively, the assay
may detect or measure binding of a test compound in the presence of a labeled competitor.
Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural
product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

An assay can be used to assess the ability of a compound to bind to its natural ligand and/or to
inhibit the binding of its natural ligand to its natural receptors. Examples of such assays include
radio-labeling assays such as those described in U.S. Patent No. 5,914,236 and U.S. Patent No. 6,372,724.
In a related embodiment, one or more amino acid substitutions can be introduced into a polypeptide
compound (such as a receptor) to improve or alter its ability to bind to its natural ligands (Matthews,
D.J. and J.A. Wells (1994) Chem. Biol. 1:25-30). In another related embodiment, one or more amino
acid substitutions can be introduced into a polypeptide compound (such as a ligand) to improve or alter
its ability to bind to its natural receptors (Cunningham, B.C. and J.A. Wells (1991) Proc. Natl. Acad.
Sci. USA 88:3407-3411; Lowman, H.B. et al. (1991) J. Biol. Chem. 266:10982-10988).

KPP, fragments of KPP, or variants of KPP may be used to screen for compounds that
modulate the activity of KPP. Such compounds may include agonists, antagonists, or partial or inverse
agonists. In one embodiment, an assay is performed under conditions permissive for KPP activity,
wherein KPP is combined with at least one test compound, and the activity of KPP in the presence of
a test compound is compared with the activity of KPP in the absence of the test compound. A
change in the activity of KPP in the presence of the test compound is indicative of a compound that
modulates the activity of KPP. Alternatively, a test compound is combined with an in vitro or
cell-free system comprising KPP under conditions suitable for KPP activity, and the assay is performed.
In either of these assays, a test compound which modulates the activity of KPP may do so indirectly
and need not come in direct contact with KPP. At least one and up to a plurality of test
compounds may be screened.
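
By way of illustration only, the comparison of KPP activity in the presence and absence of a test compound may be sketched as follows. The replicate readings, the function names, and the 20% decision threshold are hypothetical assumptions introduced for the example; the specification does not prescribe a threshold, units, or a statistical test.

```python
from statistics import mean


def percent_change(activity_with, activity_without):
    """Percent change in mean activity relative to the no-compound control."""
    baseline = mean(activity_without)
    if baseline == 0:
        raise ValueError("control activity must be non-zero")
    return 100.0 * (mean(activity_with) - baseline) / baseline


def classify(change_pct, threshold_pct=20.0):
    """Label a compound by the direction of the change (threshold is an assumption)."""
    if change_pct >= threshold_pct:
        return "increases activity"
    if change_pct <= -threshold_pct:
        return "decreases activity"
    return "no clear change"


if __name__ == "__main__":
    # Hypothetical replicate activity readings in arbitrary units.
    control = [100.0, 104.0, 96.0]
    treated = [52.0, 48.0, 50.0]
    change = percent_change(treated, control)
    print(f"{change:+.1f}% -> {classify(change)}")
```

A change in either direction flags a candidate modulator; whether the compound acts as an agonist, antagonist, or partial or inverse agonist would be established by further characterization rather than by this comparison alone.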

In another embodiment, polynucleotides encoding KPP or their mammalian homologs may be
"knocked out" in an animal model system using homologous recombination in embryonic stem (ES)
cells. Such techniques are well known in the art and are useful for the generation of animal models of
human disease (see, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337). For example,
mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and
grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted
by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science
244:1288-1292). The vector integrates into the corresponding region of the host genome by
homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP
system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D.
(1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and
the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous
strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

Polynucleotides encoding KPP may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineage, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1143-1147).

Polynucleotides encoding KPP can also be used to create "knockin" humanized animals (pigs)
or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of a
polynucleotide encoding KPP is injected into animal ES cells, and the injected sequence integrates into
the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted
as described above. Transgenic progeny or inbred lines are studied and treated with potential
pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a
mammal inbred to overexpress KPP, e.g., by secreting KPP in its milk, may also serve as a
convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
THERAPEUTICS 

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between
regions of KPP and kinases and phosphatases. In addition, examples of tissues expressing KPP can
be found in Table 6 and can also be found in Example XI. Therefore, KPP appears to play a role in
cardiovascular diseases, immune system disorders, neurological disorders, disorders affecting growth
and development, lipid disorders, cell proliferative disorders, and cancers. In the treatment of
disorders associated with increased KPP expression or activity, it is desirable to decrease the
expression or activity of KPP. In the treatment of disorders associated with decreased KPP
expression or activity, it is desirable to increase the expression or activity of KPP.

Therefore, in one embodiment, KPP or a fragment or derivative thereof may be administered
to a subject to treat or prevent a disorder associated with decreased expression or activity of KPP.
Examples of such disorders include, but are not limited to, a cardiovascular disease such as
arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial
dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular tumors, and complications
of thrombolysis, balloon angioplasty, vascular replacement, and coronary artery bypass graft surgery,
congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive
heart disease, degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid
aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart
disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus
erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart
disease, congenital heart disease, and complications of cardiac transplantation, congenital lung
anomalies, atelectasis, pulmonary congestion and edema, pulmonary embolism, pulmonary
hemorrhage, pulmonary infarction, pulmonary hypertension, vascular sclerosis, obstructive pulmonary
disease, restrictive pulmonary disease, chronic obstructive pulmonary disease, emphysema, chronic
bronchitis, bronchial asthma, bronchiectasis, bacterial pneumonia, viral and mycoplasmal pneumonia,
lung abscess, pulmonary tuberculosis, diffuse interstitial diseases, pneumoconioses, sarcoidosis,
idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis,
pulmonary eosinophilia, bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage
syndromes, Goodpasture's syndromes, idiopathic pulmonary hemosiderosis, pulmonary involvement in
collagen-vascular disorders, pulmonary alveolar proteinosis, lung tumors, inflammatory and
noninflammatory pleural effusions, pneumothorax, pleural tumors, drug-induced lung disease,
radiation-induced lung disease, and complications of lung transplantation; an immune system disorder such as
acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome,
allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic
anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy
(APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis,
dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins,
erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's
syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome,
multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis,
osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma,
Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis,
thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer,
hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic
infections, and trauma; a neurological disorder such as epilepsy, ischemic cerebrovascular disease,
stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia,
Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor
neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple
sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural
empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral
central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases
of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis,
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central
nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous
system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other
neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis,
inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental
disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD),
akathesia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses,
postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration,
and familial frontotemporal dementia; a disorder affecting growth and development such as actinic
keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease
(MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary
thrombocythemia, renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism,
Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms'
tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome,
myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary
neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism,
hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida,
anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; a lipid
disorder such as fatty liver, cholestasis, primary biliary cirrhosis, carnitine deficiency, carnitine
palmitoyltransferase deficiency, myoadenylate deaminase deficiency, hypertriglyceridemia, lipid
storage disorders such as Fabry's disease, Gaucher's disease, Niemann-Pick's disease, metachromatic
leukodystrophy, adrenoleukodystrophy, GM2 gangliosidosis, and ceroid lipofuscinosis,
abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes mellitus, lipodystrophy,
lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa, lipoid adrenal hyperplasia,
minimal change disease, lipomas, atherosclerosis, hypercholesterolemia, hypercholesterolemia with
hypertriglyceridemia, primary hypoalphalipoproteinemia, hypothyroidism, renal disease, liver disease,
lecithin:cholesterol acyltransferase deficiency, cerebrotendinous xanthomatosis, sitosterolemia,
hypocholesterolemia, Tay-Sachs disease, Sandhoff's disease, hyperlipidemia, hyperlipemia, lipid
myopathies, and obesity; and a cell proliferative disorder such as actinic keratosis, arteriosclerosis,
atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis,
paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and
cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma,
teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain,
breast, cervix, colon, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle,
ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid,
uterus, leukemias such as multiple myeloma, and lymphomas such as Hodgkin's disease.

In another embodiment, a vector capable of expressing KPP or a fragment or derivative
thereof may be administered to a subject to treat or prevent a disorder associated with decreased
expression or activity of KPP including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified KPP in conjunction
with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent a disorder
associated with decreased expression or activity of KPP including, but not limited to, those provided
above.

In still another embodiment, an agonist which modulates the activity of KPP may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of KPP including, but not limited to, those listed above.

In a further embodiment, an antagonist of KPP may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of KPP. Examples of such
disorders include, but are not limited to, those cardiovascular diseases, immune system disorders,
neurological disorders, disorders affecting growth and development, lipid disorders, cell proliferative
disorders, and cancers described above. In one aspect, an antibody which specifically binds KPP may
be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissues which express KPP.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding KPP may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of KPP including, but not limited to, those described above.

In other embodiments, any protein, agonist, antagonist, antibody, complementary sequence, or
vector embodiments may be administered in combination with other appropriate therapeutic agents.
Selection of the appropriate agents for use in combination therapy may be made by one of ordinary
skill in the art, according to conventional pharmaceutical principles. The combination of therapeutic
agents may act synergistically to effect the treatment or prevention of the various disorders described
above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of
each agent, thus reducing the potential for adverse side effects.

An antagonist of KPP may be produced using methods which are generally known in the art.
In particular, purified KPP may be used to produce antibodies or to screen libraries of pharmaceutical
agents to identify those which specifically bind KPP. Antibodies to KPP may also be generated using
methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal,
monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced by a Fab
expression library. In an embodiment, neutralizing antibodies (i.e., those which inhibit dimer formation)
can be used therapeutically. Single chain antibodies (e.g., from camels or llamas) may be potent
enzyme inhibitors and may have application in the design of peptide mimetics, and in the development
of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J. Biotechnol. 74:277-302).

For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels,
dromedaries, llamas, humans, and others may be immunized by injection with KPP or with any
fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species,
various adjuvants may be used to increase immunological response. Such adjuvants include, but are
not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among
adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are
especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to KPP
have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of at
least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are
substantially identical to a portion of the amino acid sequence of the natural protein. Short stretches of
KPP amino acids may be fused with those of another protein, such as KLH, and antibodies to the
chimeric molecule may be produced.

Monoclonal antibodies to KPP may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not limited
to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol. Methods
81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; Cole, S.P. et al. (1984)
Mol. Cell Biol. 62:109-120).

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used (Morrison, S.L. et al. (1984) Proc. Natl. Acad.
Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; Takeda, S. et al. (1985)
Nature 314:452-454). Alternatively, techniques described for the production of single chain antibodies
may be adapted, using methods known in the art, to produce KPP-specific single chain antibodies.
Antibodies with related specificity, but of distinct idiotypic composition, may be generated by chain
shuffling from random combinatorial immunoglobulin libraries (Burton, D.R. (1991) Proc. Natl. Acad.
Sci. USA 88:10134-10137).

Antibodies may also be produced by inducing in vivo production in the lymphocyte population
or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in
the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter, G. et al.
(1991) Nature 349:293-299).

Antibody fragments which contain specific binding sites for KPP may also be generated. For
example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity (Huse, W.D. et al. (1989)
Science 246:1275-1281).

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between KPP and its specific
antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two
non-interfering KPP epitopes is generally used, but a competitive binding assay may also be employed
(Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques
may be used to assess the affinity of antibodies for KPP. Affinity is expressed as an association
constant, Ka, which is defined as the molar concentration of KPP-antibody complex divided by the
molar concentrations of free antigen and free antibody under equilibrium conditions. The Ka
determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for
multiple KPP epitopes, represents the average affinity, or avidity, of the antibodies for KPP. The Ka
determined for a preparation of monoclonal antibodies, which are monospecific for a particular KPP
epitope, represents a true measure of affinity. High-affinity antibody preparations with Ka ranging
from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the KPP-antibody
complex must withstand rigorous manipulations. Low-affinity antibody preparations with Ka ranging
from about 10^6 to 10^7 L/mole are preferred for use in immunopurification and similar procedures
which ultimately require dissociation of KPP, preferably in active form, from the antibody (Catty, D.
(1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC; Liddell, J.E. and A.
Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York NY).
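
By way of illustration only, the association constant defined above and its estimation by Scatchard analysis may be sketched as follows. The sketch assumes a single class of binding sites, for which bound/free = Ka x (Bmax - bound), so that a straight-line fit of bound/free against bound has slope -Ka. The function names and the synthetic concentrations are hypothetical and are not data from the specification.

```python
def association_constant(complex_m, free_antigen_m, free_antibody_m):
    """Ka in L/mole: [complex] / ([free antigen] x [free antibody]) at equilibrium."""
    return complex_m / (free_antigen_m * free_antibody_m)


def scatchard_fit(bound_m, free_m):
    """Least-squares fit of bound/free versus bound; returns (Ka, Bmax).

    Assumes one class of binding sites, so the plot is linear with slope -Ka
    and x-intercept Bmax.
    """
    if len(bound_m) != len(free_m) or len(bound_m) < 2:
        raise ValueError("need at least two paired bound/free measurements")
    ratios = [b / f for b, f in zip(bound_m, free_m)]
    n = len(bound_m)
    mean_x = sum(bound_m) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound_m)
    if sxx == 0:
        raise ValueError("bound concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound_m, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    if ka <= 0:
        raise ValueError("fit did not yield a positive association constant")
    return ka, intercept / ka


if __name__ == "__main__":
    # Synthetic, noise-free points generated from an assumed Ka and Bmax.
    ka_true, bmax_true = 1e9, 1e-9
    bound = [2e-10, 4e-10, 6e-10, 8e-10]
    free = [b / (ka_true * (bmax_true - b)) for b in bound]
    ka, bmax = scatchard_fit(bound, free)
    print(f"Ka = {ka:.3g} L/mole, Bmax = {bmax:.3g} M")
```

For a polyclonal preparation the Scatchard plot is generally curved rather than linear, so a single fitted slope of this kind would correspond to the average affinity described above rather than to a true per-epitope affinity.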

The titer and avidity of polyclonal antibody preparations may be further evaluated to determine
the quality and suitability of such preparations for certain downstream applications. For example, a
polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg
specific antibody/ml, is generally employed in procedures requiring precipitation of KPP-antibody
complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for
antibody quality and usage in various applications, are generally available (Catty, supra; Coligan et al.,
supra).

In another embodiment of the invention, polynucleotides encoding KPP, or any fragment or
complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
KPP. Such technology is well known in the art, and antisense oligonucleotides or larger fragments
can be designed from various locations along the coding or control regions of sequences encoding
KPP (Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press, Totowa NJ).

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein (Slater, J.E. et
al. (1998) J. Allergy Clin. Immunol. 102:469-475; Scanlon, K.J. et al. (1995) FASEB J. 9:1288-1296).
Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as
retrovirus and adeno-associated virus vectors (Miller, A.D. (1990) Blood 76:271-278; Ausubel et al.,
supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63:323-347). Other gene delivery
mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in
the art (Rossi, J.J. (1995) Br. Med. Bull. 51:217-225; Boado, R.J. et al. (1998) J. Pharm. Sci.
87:1308-1315; Morris, M.C. et al. (1997) Nucleic Acids Res. 25:2730-2736).

In another embodiment of the invention, polynucleotides encoding KPP may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by
X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g.,
against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988)
Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis
B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In
the case where a genetic deficiency in KPP expression or regulation causes disease, the expression of
KPP from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in KPP 
are treated by constructing mammalian expression vectors encoding KPP and introducing these
vectors by mechanical means into KPP-deficient cells. Mechanical transfer technologies for use with
cells in vivo or in vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold
particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the
use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem. 62:191-217;
Ivics, Z. (1997) Cell 91:501-510; Boulay, J.-L. and H. Récipon (1998) Curr. Opin. Biotechnol.
9:445-450).

Expression vectors that may be effective for the expression of KPP include, but are not 
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors 
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA),
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (BD Clontech, Palo Alto CA). KPP
may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter
(e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci.
USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and H.M. Blau
(1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen);
the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the
FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V.
and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous
gene encoding KPP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these
standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to KPP expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding KPP under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et
al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses
a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference.
Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the
return of transduced cells to a patient are procedures well known to persons skilled in the art of gene
therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et
al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998)
Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

In an embodiment, an adenovirus-based gene therapy delivery system is used to deliver 
polynucleotides encoding KPP to cells which have one or more genetic abnormalities with respect to 
the expression of KPP. The construction and packaging of adenovirus-based vectors are well known 
to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be
versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999; Annu.
Rev. Nutr. 19:511-544) and Verma, I.M. and N. Somia (1997; Nature 389:239-242).

In another embodiment, a herpes-based gene therapy delivery system is used to deliver
polynucleotides encoding KPP to target cells which have one or more genetic abnormalities with
respect to the expression of KPP. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing KPP to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92
which consists of a genome containing at least one exogenous gene to be transferred to a cell under
the control of the appropriate promoter for purposes including human gene therapy. Also taught by
this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and
ICP22. For HSV vectors, see also Goins, W.F. et al. (1999; J. Virol. 73:519-532) and Xu, H. et al.
(1994; Dev. Biol. 163:152-161). The manipulation of cloned herpesvirus sequences, the generation of
recombinant virus following the transfection of multiple plasmids containing different segments of the
large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells with
herpesvirus are techniques well known to those of ordinary skill in the art.

In another embodiment, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding KPP to target cells. The biology of the prototypic alphavirus, Semliki
Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the
SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During alphavirus
RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins.
This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting in the
overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease
and polymerase). Similarly, inserting the coding sequence for KPP into the alphavirus genome in
place of the capsid-coding region results in the production of a large number of KPP-coding RNAs
and the synthesis of high levels of KPP in vector transduced cells. While alphavirus infection is
typically associated with cell lysis within a few days, the ability to establish a persistent infection in
hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic
replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga,
S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction
of KPP into a variety of cell types. The specific transduction of a subset of cells in a population may
require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones
of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus
infections, are well known to those with ordinary skill in the art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions -10
and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can
be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it
causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases,
transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have
been described in the literature (Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr, Molecular and
Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177). A complementary
sequence or antisense molecule may also be designed to block translation of mRNA by preventing the
transcript from binding to ribosomes.

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of RNA molecules encoding KPP.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
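
The initial scanning step described above can be sketched as follows. This is a minimal illustration
under stated assumptions: the function names and the choice to roughly center each window on the
cleavage triplet are hypothetical, and the sketch does not evaluate secondary structure or
accessibility, which are assessed separately.

```python
# Illustrative sketch: locate GUA, GUU, and GUC triplets in a target RNA and
# extract a short surrounding window for later structural evaluation.
# Names and the window-centering rule are assumptions made for illustration.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def normalize_rna(seq: str) -> str:
    """Upper-case the sequence and write thymine as uracil."""
    return seq.upper().replace("T", "U")


def find_cleavage_sites(rna: str):
    """Return (position, triplet) for every candidate cleavage triplet."""
    rna = normalize_rna(rna)
    return [
        (i, rna[i:i + 3])
        for i in range(len(rna) - 2)
        if rna[i:i + 3] in CLEAVAGE_TRIPLETS
    ]


def window_around(rna: str, site: int, length: int = 20) -> str:
    """Return a window of `length` nucleotides (15 to 20) containing the triplet at `site`."""
    if not 15 <= length <= 20:
        raise ValueError("window length is expected to be between 15 and 20")
    rna = normalize_rna(rna)
    # Center the triplet where possible, clipping at either end of the molecule.
    start = max(0, min(site - (length - 3) // 2, len(rna) - length))
    return rna[start:start + length]


if __name__ == "__main__":
    target = "A" * 10 + "GUC" + "A" * 17  # hypothetical target, not a KPP sequence
    for pos, triplet in find_cleavage_sites(target):
        print(pos, triplet, window_around(target, pos))
```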

Complementary ribonucleic acid molecules and ribozymes may be prepared by any method
known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically
synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively,
RNA molecules may be generated by in vitro and in vivo transcription of DNA molecules encoding
KPP. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA
polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize
complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends
of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent in the production of PNAs and can be
extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine,
and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytosine,
guanine, thymine, and uracil which are not as easily recognized by endogenous endonucleases.

In other embodiments of the invention, the expression of one or more selected polynucleotides 
of the present invention can be altered, inhibited, decreased, or silenced using RNA interference 
(RNAi) or post-transcriptional gene silencing (PTGS) methods known in the art. RNAi is a
post-transcriptional mode of gene silencing in which double-stranded RNA (dsRNA) introduced into a
targeted cell specifically suppresses the expression of the homologous gene (i.e., the gene bearing the
sequence complementary to the dsRNA). This effectively knocks out or substantially reduces the
expression of the targeted gene. PTGS can also be accomplished by use of DNA or DNA fragments.
RNAi methods are described by Fire, A. et al. (1998; Nature 391:806-811) and Gura, T.
(2000; Nature 404:804-808). PTGS can also be initiated by introduction of a complementary segment
of DNA into the selected tissue using gene delivery and/or viral vector delivery methods described
herein or known in the art. 

RNAi can be induced in mammalian cells by the use of small interfering RNA, also known as
siRNA. siRNA are shorter segments of dsRNA (typically about 21 to 23 nucleotides in length) that
result in vivo from cleavage of introduced dsRNA by the action of an endogenous ribonuclease.
siRNA appear to be the mediators of the RNAi effect in mammals. The most effective siRNAs
appear to be 21 nucleotide dsRNAs with 2 nucleotide 3' overhangs. The use of siRNA for inducing
RNAi in mammalian cells is described by Elbashir, S.M. et al. (2001; Nature 411:494-498).

siRNA can be generated indirectly by introduction of dsRNA into the targeted cell.
Alternatively, siRNA can be synthesized directly and introduced into a cell by transfection methods
and agents described herein or known in the art (such as liposome-mediated transfection, viral vector
methods, or other polynucleotide delivery/introductory methods). Suitable siRNAs can be selected by
examining a transcript of the target polynucleotide (e.g., mRNA) for nucleotide sequences
downstream from the AUG start codon and recording the occurrence of each nucleotide and the 3'
adjacent 19 to 23 nucleotides as potential siRNA target sites, with sequences having a 21 nucleotide
length being preferred. Regions to be avoided for target siRNA sites include the 5' and 3' untranslated
regions (UTRs) and regions near the start codon (within 75 bases), as these may be richer in
regulatory protein binding sites. UTR-binding proteins and/or translation initiation complexes may
interfere with binding of the siRNP endonuclease complex. The selected target sites for siRNA can
then be compared to the appropriate genome database (e.g., human, etc.) using BLAST or other
sequence comparison algorithms known in the art. Target sequences with significant homology to
other coding sequences can be eliminated from consideration. The selected siRNAs can be produced
by chemical synthesis methods known in the art or by in vitro transcription using commercially
available methods and kits such as the SILENCER siRNA construction kit (Ambion, Austin TX).
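
The site-selection rules above can be sketched as a simple enumeration. The sketch is illustrative
only and rests on assumptions: the function name, the treatment of the site length as the total
length of the candidate (21 nucleotides by default), and the coordinate convention (index of the A
of the AUG, and index one past the end of the coding region) are hypothetical. The homology
comparison against a genome database is not performed here.

```python
# Illustrative sketch: enumerate candidate siRNA target sites within a coding
# region, skipping the UTRs and the first 75 bases downstream of the AUG.
# Coordinates and names are assumptions made for illustration.
def sirna_candidates(mrna: str, cds_start: int, cds_end: int,
                     site_len: int = 21, skip: int = 75):
    """Return (position, site) pairs lying wholly inside the coding region.

    cds_start: 0-based index of the A of the AUG start codon.
    cds_end:   0-based index one past the last coding nucleotide.
    """
    if not 0 <= cds_start < cds_end <= len(mrna):
        raise ValueError("coding region coordinates fall outside the transcript")
    mrna = mrna.upper().replace("T", "U")
    first = cds_start + skip      # avoid the region near the start codon
    last = cds_end - site_len     # the site must end before the 3' UTR
    return [(i, mrna[i:i + site_len]) for i in range(first, last + 1)]


if __name__ == "__main__":
    # Hypothetical transcript: 10-nt 5' UTR, 120-nt coding region, 10-nt 3' UTR.
    transcript = "C" * 10 + "AUG" + "A" * 117 + "G" * 10
    sites = sirna_candidates(transcript, 10, 130)
    print(len(sites), sites[0])
```

Each candidate would then be compared against the appropriate genome database, and candidates
with significant homology to other coding sequences would be set aside.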

In alternative embodiments, long-term gene silencing and/or RNAi effects can be induced in
selected tissue using expression vectors that continuously express siRNA. This can be accomplished
using expression vectors that are engineered to express hairpin RNAs (shRNAs) using methods
known in the art (see, e.g., Brummelkamp, T.R. et al. (2002) Science 296:550-553; and Paddison, P.J.
et al. (2002) Genes Dev. 16:948-958). In these and related embodiments, shRNAs can be delivered to
target cells using expression vectors known in the art. An example of a suitable expression vector for
delivery of siRNA is the PSILENCER1.0-U6 (circular) plasmid (Ambion). Once delivered to the
target tissue, shRNAs are processed in vivo into siRNA-like molecules capable of carrying out
gene-specific silencing.

In various embodiments, the expression levels of genes targeted by RNAi or PTGS methods
can be determined by assays for mRNA and/or protein analysis. Expression levels of the mRNA of a
targeted gene can be determined, for example, by northern analysis methods using the
NORTHERNMAX-GLY kit (Ambion); by microarray methods; by PCR methods; by real time PCR
methods; and by other RNA/polynucleotide assays known in the art or described herein. Expression
levels of the protein encoded by the targeted gene can be determined, for example, by microarray
methods; by polyacrylamide gel electrophoresis; and by Western analysis using standard techniques
known in the art.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding KPP. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased KPP
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding KPP may be therapeutically useful, and in the treatment of disorders associated with
decreased KPP expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding KPP may be therapeutically useful.

In various embodiments, one or more test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding KPP is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding KPP are assayed by
any method commonly known in the art. Typically, the expression of a specific polynucleotide is detected
by hybridization with a probe having a nucleotide sequence complementary to the sequence of the
polynucleotide encoding KPP. The amount of hybridization may be quantified, thus forming the basis
for a comparison of the expression of the polynucleotide both with and without exposure to one or
more test compounds. Detection of a change in the expression of a polynucleotide exposed to a test
compound indicates that the test compound is effective in altering the expression of the polynucleotide.
A screen for a compound effective in altering expression of a specific polynucleotide can be carried
out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins, D. et al.
(1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res. 28:E15) or a human
cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. Res. Commun. 268:8-13).
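
The comparison of quantified hybridization with and without exposure to a test compound can be
sketched as a simple ratio. This is an illustration under stated assumptions: the function names are
hypothetical, and the two-fold cutoff is an arbitrary example value that is not specified in this
description.

```python
# Illustrative sketch: compare hybridization signal with and without a test
# compound. The two-fold threshold is an assumed example value only.
def fold_change(treated_signal: float, untreated_signal: float) -> float:
    """Ratio of signal with the test compound to signal without it."""
    if treated_signal < 0 or untreated_signal <= 0:
        raise ValueError("signals must be non-negative, with a positive untreated signal")
    return treated_signal / untreated_signal


def classify_compound(treated_signal: float, untreated_signal: float,
                      threshold: float = 2.0) -> str:
    """Label a compound as promoting, inhibiting, or not clearly altering expression."""
    ratio = fold_change(treated_signal, untreated_signal)
    if ratio >= threshold:
        return "promotes"
    if ratio <= 1.0 / threshold:
        return "inhibits"
    return "no clear change"


if __name__ == "__main__":
    print(classify_compound(50.0, 100.0))
```

In practice replicate measurements and appropriate controls would be needed before a change is
attributed to the test compound.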

A particular embodiment of the present invention involves screening a combinatorial library of
oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and modified
oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, T.W. et al.
(1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S. Patent No. 6,022,691).

Many methods for introducing vectors into cells or tissues are available and equally suitable
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art (Goldman, C.K. et al. (1997) Nat. Biotechnol.
15:462-466).

Any of the therapeutic methods described above may be applied to any subject in need of 
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and
monkeys. 

An additional embodiment of the invention relates to the administration of a composition which
generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient.
Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various
formulations are commonly known and are thoroughly discussed in the latest edition of Remington's
Pharmaceutical Sciences (Mack Publishing, Easton PA). Such compositions may consist of KPP,
antibodies to KPP, and mimetics, agonists, antagonists, or inhibitors of KPP.

In various embodiments, the compositions described herein, such as pharmaceutical
compositions, may be administered by any number of routes including, but not limited to, oral,
intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, pulmonary,
transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides and
proteins), recent developments in the field of pulmonary delivery via the alveolar region of the lung
have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J.S.
et al., U.S. Patent No. 5,997,848). Pulmonary delivery allows administration without needle injection,
and obviates the need for potentially toxic penetration enhancers.

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising KPP or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the
macromolecule. Alternatively, KPP or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et
al. (1999) Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys,
or pigs. An animal model may also be used to determine the appropriate concentration range and
route of administration. Such information can then be used to determine useful doses and routes for
administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example KPP
or fragments thereof, antibodies to KPP, and agonists, antagonists, or inhibitors of KPP, which
ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration.
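
As a worked illustration of the LD50/ED50 ratio, the calculation can be sketched as follows. The
function name and the dose values are hypothetical and are chosen only to show the arithmetic; both
doses are assumed to be expressed in the same units.

```python
# Illustrative sketch: therapeutic index as the LD50/ED50 ratio.
# The example doses are hypothetical and carry no experimental meaning.
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # A hypothetical composition with LD50 = 500 and ED50 = 25 (same units).
    print(therapeutic_index(500.0, 25.0))
```

A larger ratio corresponds to a wider margin between the effective and the lethal dose, which is why
compositions with large therapeutic indices are preferred.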

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response
to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or
biweekly depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.

DIAGNOSTICS

In another embodiment, antibodies which specifically bind KPP may be used for the diagnosis
of disorders characterized by expression of KPP, or in assays to monitor patients being treated with
KPP or agonists, antagonists, or inhibitors of KPP. Antibodies useful for diagnostic purposes may be
prepared in the same manner as described above for therapeutics. Diagnostic assays for KPP include
methods which utilize the antibody and a label to detect KPP in human body fluids or in extracts of
cells or tissues. The antibodies may be used with or without modification, and may be labeled by
covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter molecules,
several of which are described above, are known in the art and may be used.

A variety of protocols for measuring KPP, including ELISAs, RIAs, and FACS, are known in
the art and provide a basis for diagnosing altered or abnormal levels of KPP expression. Normal or
standard values for KPP expression are established by combining body fluids or cell extracts taken
from normal mammalian subjects, for example, human subjects, with antibodies to KPP under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of KPP expressed in subject,
control, and disease samples from biopsied tissues are compared with the standard values. Deviation
between standard and subject values establishes the parameters for diagnosing disease.

In another embodiment of the invention, polynucleotides encoding KPP may be used for 
diagnostic purposes. The polynucleotides which may be used include oligonucleotides, complementary 
RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantify gene
expression in biopsied tissues in which expression of KPP may be correlated with disease. The 
diagnostic assay may be used to determine absence, presence, and excess expression of KPP, and to 
monitor regulation of KPP levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotides,
including genomic sequences, encoding KPP or closely related molecules may be used to identify
nucleic acid sequences which encode KPP. The specificity of the probe, whether it is made from a
highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a conserved
motif, and the stringency of the hybridization or amplification will determine whether the probe
identifies only naturally occurring sequences encoding KPP, allelic variants, or related sequences.

Probes may also be used for the detection of related sequences, and may have at least 50%
sequence identity to any of the KPP encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO: 16-30 or from
genomic sequences including promoters, enhancers, and introns of the KPP gene.

Means for producing specific hybridization probes for polynucleotides encoding KPP include 
the cloning of polynucleotides encoding KPP or KPP derivatives into vectors for the production of
mRNA probes. Such vectors are known in the art, are commercially available, and may be used to
synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and
the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter
groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels, such as alkaline
phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotides encoding KPP may be used for the diagnosis of disorders associated with
expression of KPP. Examples of such disorders include, but are not limited to, a cardiovascular
disease such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease,
aneurysms, arterial dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular
tumors, and complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary
artery bypass graft surgery, congestive heart failure, ischemic heart disease, angina pectoris,
myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic
valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse,
rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic
endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy,
myocarditis, pericarditis, neoplastic heart disease, congenital heart disease, and complications of
cardiac transplantation, congenital lung anomalies, atelectasis, pulmonary congestion and edema,
pulmonary embolism, pulmonary hemorrhage, pulmonary infarction, pulmonary hypertension, vascular
sclerosis, obstructive pulmonary disease, restrictive pulmonary disease, chronic obstructive pulmonary
disease, emphysema, chronic bronchitis, bronchial asthma, bronchiectasis, bacterial pneumonia, viral
and mycoplasmal pneumonia, lung abscess, pulmonary tuberculosis, diffuse interstitial diseases,
pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis,
hypersensitivity pneumonitis, pulmonary eosinophilia, bronchiolitis obliterans-organizing pneumonia,
diffuse pulmonary hemorrhage syndromes, Goodpasture's syndromes, idiopathic pulmonary
hemosiderosis, pulmonary involvement in collagen-vascular disorders, pulmonary alveolar proteinosis,
lung tumors, inflammatory and noninflammatory pleural effusions, pneumothorax, pleural tumors,
drug-induced lung disease, radiation-induced lung disease, and complications of lung transplantation; an
immune system disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease,
adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma,
atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic
lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis,
glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's
syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner
syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial,
fungal, parasitic, protozoal, and helminthic infections, and trauma; a neurological disorder such as
epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's
disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders,
amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy,
retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial
and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including
kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial
insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous
sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation
and other developmental disorders of the central nervous system including Down syndrome, cerebral
palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal
cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system
disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies,
myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic
disorders, seasonal affective disorder (SAD), akathesia, amnesia, catatonia, diabetic neuropathy,
tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder,
progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; a
disorder affecting growth and development such as actinic keratosis, arteriosclerosis, atherosclerosis,
bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal
nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, renal tubular
acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular
dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary
abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome,
hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as
Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders
such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis,
congenital glaucoma, cataract, and sensorineural hearing loss; a lipid disorder such as fatty liver,
cholestasis, primary biliary cirrhosis, carnitine deficiency, carnitine palmitoyltransferase deficiency,
myoadenylate deaminase deficiency, hypertriglyceridemia, lipid storage disorders such as Fabry's
disease, Gaucher's disease, Niemann-Pick's disease, metachromatic leukodystrophy,
adrenoleukodystrophy, GM2 gangliosidosis, and ceroid lipofuscinosis, abetalipoproteinemia, Tangier
disease, hyperlipoproteinemia, diabetes mellitus, lipodystrophy, lipomatoses, acute panniculitis,
disseminated fat necrosis, adiposis dolorosa, lipoid adrenal hyperplasia, minimal change disease,
lipomas, atherosclerosis, hypercholesterolemia, hypercholesterolemia with hypertriglyceridemia,
primary hypoalphalipoproteinemia, hypothyroidism, renal disease, liver disease, lecithin:cholesterol
acyltransferase deficiency, cerebrotendinous xanthomatosis, sitosterolemia, hypocholesterolemia,
Tay-Sachs disease, Sandhoff's disease, hyperlipidemia, hyperlipemia, lipid myopathies, and obesity; and a
cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis,
hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in
particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, colon, gall
bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid,
penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, uterus, leukemias such as multiple
myeloma, and lymphomas such as Hodgkin's disease. Polynucleotides encoding KPP may be used in
Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; in
dipstick, pin, and multiformat ELISA-like assays; and in microarrays utilizing fluids or tissues from
patients to detect altered KPP expression. Such qualitative or quantitative methods are well known in
the art.

In a particular embodiment, polynucleotides encoding KPP may be used in assays that detect
the presence of associated disorders, particularly those mentioned above. Polynucleotides
complementary to sequences encoding KPP may be labeled by standard methods and added to a fluid
or tissue sample from a patient under conditions suitable for the formation of hybridization complexes.
After a suitable incubation period, the sample is washed and the signal is quantified and compared with
a standard value. If the amount of signal in the patient sample is significantly altered in comparison to
a control sample then the presence of altered levels of polynucleotides encoding KPP in the sample
indicates the presence of the associated disorder. Such assays may also be used to evaluate the
efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor
the treatment of an individual patient.

In order to provide a basis for the diagnosis of a disorder associated with expression of KPP,
a normal or standard profile for expression is established. This may be accomplished by combining
body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a
fragment thereof, encoding KPP, under conditions suitable for hybridization or amplification. Standard
hybridization may be quantified by comparing the values obtained from normal subjects with values
from an experiment in which a known amount of a substantially purified polynucleotide is used.
Standard values obtained in this manner may be compared with values obtained from samples from
patients who are symptomatic for a disorder. Deviation from standard values is used to establish the
presence of a disorder.
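
The comparison of a patient value against a standard profile can be sketched as follows. This is only an illustrative sketch: the z-score rule, the cutoff of 2.0, and the function names are assumptions made here and are not prescribed by the description, which leaves the statistical criterion for "deviation" open.

```python
from statistics import mean, stdev


def standard_profile(normal_signals):
    """Summarize signals from normal subjects as (mean, sample standard deviation)."""
    if len(normal_signals) < 2:
        raise ValueError("need at least two normal samples")
    return mean(normal_signals), stdev(normal_signals)


def deviates_from_standard(patient_signal, normal_signals, z_cutoff=2.0):
    """Flag a patient signal whose z-score against the normal profile exceeds the cutoff.

    The z-score criterion and the default cutoff are illustrative assumptions.
    """
    mu, sigma = standard_profile(normal_signals)
    if sigma == 0:
        return patient_signal != mu
    return abs(patient_signal - mu) / sigma > z_cutoff
```

Both under- and overexpression are flagged, since the absolute deviation from the standard mean is tested.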

Once the presence of a disorder is established and a treatment protocol is initiated, 
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a period ranging from several 
days to months. 

With respect to cancer, the presence of an abnormal amount of transcript (either under- or 
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development
of the disease, or may provide a means for detecting the disease prior to the appearance of actual
clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ
preventative measures or aggressive treatment earlier, thereby preventing the development or further
progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding KPP
may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding KPP, or a fragment of a polynucleotide complementary to the polynucleotide encoding KPP,
and will be employed under optimized conditions for identification of a specific gene or condition.
Oligomers may also be employed under less stringent conditions for detection or quantification of
closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from polynucleotides encoding KPP
may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and
deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of
SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and
fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from polynucleotides
encoding KPP are used to amplify DNA using the polymerase chain reaction (PCR). The DNA may
be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like.
SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in
single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing
gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the
amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence
database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by
comparing the sequence of individual overlapping DNA fragments which assemble into a common
consensus sequence. These computer-based methods filter out sequence variations due to laboratory
preparation of DNA and sequencing errors using statistical models and automated analyses of DNA
sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass
spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San
Diego CA).
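
The in silico comparison of overlapping fragments can be illustrated with a deliberately simplified sketch. It assumes the fragments have already been aligned to a common coordinate system and padded to equal length, and it replaces the statistical models and chromatogram analysis mentioned above with a plain minimum-support filter; the function name, the padding character, and both thresholds are hypothetical choices made for illustration only.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_reads=2, min_fraction=0.2, pad="."):
    """Return (position, major_base, minor_base) for columns with a supported minor base.

    aligned_reads: equal-length strings; `pad` marks positions a read does not cover.
    A minor base seen in fewer than `min_reads` reads, or below `min_fraction` of the
    column depth, is treated as a likely sequencing error and ignored.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be padded to the same length")
    snps = []
    for pos in range(length):
        counts = Counter(read[pos] for read in aligned_reads if read[pos] != pad)
        if len(counts) < 2:
            continue  # column is uncovered or shows no variation
        (major, _), (minor, minor_n) = counts.most_common(2)
        depth = sum(counts.values())
        if minor_n >= min_reads and minor_n / depth >= min_fraction:
            snps.append((pos, major, minor))
    return snps
```

With five reads in which two carry an A at position 2 and a single read carries a T at position 5, only position 2 is reported; the singleton at position 5 is filtered out as a probable error.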

SNPs may be used to study the genetic basis of human disease. For example, at least 16
common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also
useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis,
sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding
lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic
fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that
influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in
N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the
anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in
diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase
pathway. Analysis of the distribution of SNPs in different populations is useful for investigating
genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations
and their migrations (Taylor, J.G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu
(1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641).

Methods which may also be used to quantify the expression of KPP include radiolabeling or
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves (Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993)
Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be accelerated by
running the assay in a high-throughput format where the oligomer or polynucleotide of interest is
presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.
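
Interpolation from a standard curve, as mentioned above, can be sketched as a piecewise-linear lookup. The function name, the (amount, signal) data layout, and the choice of linear interpolation between neighboring standards are assumptions for illustration; the cited methods may fit the curve differently.

```python
def interpolate_from_standard_curve(signal, standards):
    """Estimate an amount from a measured signal by piecewise-linear interpolation.

    standards: iterable of (known_amount, measured_signal) pairs from a dilution series.
    Raises ValueError if the signal lies outside the range covered by the standards.
    """
    points = sorted(standards, key=lambda p: p[1])
    if len(points) < 2:
        raise ValueError("need at least two standards")
    for (amt_lo, sig_lo), (amt_hi, sig_hi) in zip(points, points[1:]):
        if sig_lo <= signal <= sig_hi:
            if sig_hi == sig_lo:
                return amt_lo
            frac = (signal - sig_lo) / (sig_hi - sig_lo)
            return amt_lo + frac * (amt_hi - amt_lo)
    raise ValueError("signal outside range of standard curve")
```

Signals outside the measured range are rejected rather than extrapolated, which is a conservative design choice made for this sketch.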

In further embodiments, oligonucleotides or longer fragments derived from any of the 
polynucleotides described herein may be used as elements on a microarray. The microarray can be 
used in transcript imaging techniques which monitor the relative expression levels of large numbers of
genes simultaneously as described below. The microarray may also be used to identify genetic
variants, mutations, and polymorphisms. This information may be used to determine gene function, to
understand the genetic basis of a disorder, to diagnose a disorder, to monitor progression/regression of
disease as a function of gene expression, and to develop and monitor the activities of therapeutic
agents in the treatment of disease. In particular, this information may be used to develop a
pharmacogenomic profile of a patient in order to select the most appropriate and effective treatment
regimen for that patient. For example, therapeutic agents which are highly effective and display the
fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.

In another embodiment, KPP, fragments of KPP, or antibodies specific for KPP may be used
as elements on a microarray. The microarray may be used to monitor or measure protein-protein
interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at
a given time (Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No. 5,840,484;
hereby expressly incorporated by reference herein). Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity.

Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies,
or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the 
case of a tissue or biopsy sample, or in vitro, as in the case of a cell line. 

Transcript images which profile the expression of the polynucleotides of the present invention
may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity
(Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson (2000)
Toxicol. Lett. 112-113:467-471). If a test compound has a signature similar to that of a compound
with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are
most useful and refined when they contain expression information from a large number of genes and
gene families. Ideally, a genome-wide measurement of expression provides the highest quality
signature. Even genes whose expression is not altered by any tested compounds are important as
well, as the levels of expression of these genes are used to normalize the rest of the expression data.
The normalization procedure is useful for comparison of expression data after treatment with different
compounds. While the assignment of gene function to elements of a toxicant signature aids in
interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical
matching of signatures which leads to prediction of toxicity (see, for example, Press Release 00-02
from the National Institute of Environmental Health Sciences, released February 29, 2000, available at
niehs.nih.gov/oc/news/toxchip.htm). Therefore, it is important and desirable in toxicological screening
using toxicant signatures to include all expressed gene sequences.
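
The statistical matching of signatures described above can be illustrated with a minimal sketch that ranks reference signatures by Pearson correlation with a test signature. The use of Pearson correlation, the function names, and the compound labels in the usage note are assumptions for illustration; the description does not specify a particular matching statistic.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signatures must have the same length (at least 2 genes)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant signature has no defined correlation")
    return cov / sqrt(var_x * var_y)


def closest_signature(test_signature, reference_signatures):
    """Return (name, correlation) of the reference signature most similar to the test.

    reference_signatures: dict mapping a compound name to expression values listed
    in the same gene order as test_signature.
    """
    scored = {name: pearson(test_signature, ref)
              for name, ref in reference_signatures.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]
```

For example, a test signature of `[1.8, -0.9, 0.6, 0.1]` compared against hypothetical references `"toxicant_A": [2.0, -1.0, 0.5, 0.0]` and `"toxicant_B": [-1.5, 1.0, 0.0, 0.5]` is matched to `toxicant_A`. No knowledge of gene function enters the comparison, consistent with the point made in the text.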

In an embodiment, the toxicity of a test compound can be assessed by treating a biological
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the
treated biological sample are hybridized with one or more probes specific to the polynucleotides of the
present invention, so that transcript levels corresponding to the polynucleotides of the present invention
may be quantified. The transcript levels in the treated biological sample are compared with levels in
an untreated biological sample. Differences in the transcript levels between the two samples are
indicative of a toxic response caused by the test compound in the treated sample.
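
A minimal sketch of the treated-versus-untreated comparison follows. It assumes positive raw signals keyed by gene name, normalizes the treated sample using a set of genes presumed unaffected by treatment, and reports genes whose normalized log2 ratio exceeds a cutoff. The gene names, the twofold cutoff, and the scaling method are illustrative assumptions rather than values taken from the description.

```python
from math import log2


def normalized_log2_ratios(treated, untreated, reference_genes):
    """Log2(treated/untreated) per gene after scaling by unchanged reference genes.

    treated, untreated: dicts mapping gene name to a positive raw signal.
    reference_genes: genes assumed not to respond to the test compound.
    """
    scale = (sum(untreated[g] for g in reference_genes)
             / sum(treated[g] for g in reference_genes))
    return {g: log2(treated[g] * scale / untreated[g])
            for g in treated if g in untreated}


def responsive_genes(treated, untreated, reference_genes, min_abs_log2=1.0):
    """Genes whose normalized change is at least twofold (by default) in either direction."""
    ratios = normalized_log2_ratios(treated, untreated, reference_genes)
    return sorted(g for g, r in ratios.items() if abs(r) >= min_abs_log2)
```

Scaling by the reference genes corrects for differences in overall signal between the two hybridizations before any per-gene difference is interpreted.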

Another embodiment relates to the use of the polypeptides disclosed herein to analyze the
proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression
in a particular tissue or cell type. Each protein component of a proteome can be subjected individually
to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number
of expressed proteins and their relative abundance under given conditions and at a given time. A
profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a
particular tissue or cell type. In one embodiment, the separation is achieved using two-dimensional gel
electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first
dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis
in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as
discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie
Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to
the level of the protein in the sample. The optical densities of equivalently positioned protein spots
from different samples, for example, from biological samples either treated or untreated with a test
compound or therapeutic agent, are compared to identify any changes in protein spot density related to
the treatment. The proteins in the spots are partially sequenced using, for example, standard methods
employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein
in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous
amino acid residues, to the polypeptide sequences of interest. In some cases, further sequence data
may be obtained for definitive protein identification.
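
The comparison of a partial sequence against polypeptide sequences of interest can be sketched as an exact search for a shared run of at least 5 contiguous residues. The function names and the example sequences in the usage note are hypothetical; they are not SEQ ID entries from this document, and exact substring matching is a simplifying assumption.

```python
def shares_contiguous_residues(partial_sequence, candidate, min_length=5):
    """True if any window of `min_length` residues from the partial sequence
    occurs exactly within the candidate polypeptide sequence."""
    if min_length < 1:
        raise ValueError("min_length must be positive")
    partial = partial_sequence.upper()
    target = candidate.upper()
    for start in range(len(partial) - min_length + 1):
        if partial[start:start + min_length] in target:
            return True
    return False


def matching_polypeptides(partial_sequence, polypeptides, min_length=5):
    """Names of polypeptides sharing a contiguous run with the partial sequence.

    polypeptides: dict mapping a name to a one-letter amino acid sequence.
    """
    return [name for name, seq in polypeptides.items()
            if shares_contiguous_residues(partial_sequence, seq, min_length)]
```

For instance, the partial sequence `KSKPKDA` matches a hypothetical entry `MGSNKSKPKDASQRRRSLEPAEN` but not `MAAAAAAAAGGGGGG`; a partial sequence shorter than `min_length` never matches.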

A proteomic profile may also be generated using antibodies specific for KPP to quantify the
levels of KPP expression. In one embodiment, the antibodies are used as elements on a microarray,
and protein expression levels are quantified by contacting the microarray with the sample and
detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by
a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol-
or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each
array element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid
degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological sample.
A difference in the amount of protein between the two samples is indicative of a toxic response to the
test compound in the treated sample. Individual proteins are identified by sequencing the amino acid
residues of the individual proteins and comparing these partial sequences to the polypeptides of the
present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are incubated
with antibodies specific to the polypeptides of the present invention. The amount of protein recognized
by the antibodies is quantified. The amount of protein in the treated biological sample is compared
with the amount in an untreated biological sample. A difference in the amount of protein between the
two samples is indicative of a toxic response to the test compound in the treated sample.

Microarrays may be prepared, used, and analyzed using methods known in the art (Brennan,
T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA
93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/25116; Shalon, D. et al. (1995)
PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155;
Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662). Various types of microarrays are well known
and thoroughly described in Schena, M., ed. (1999; DNA Microarrays: A Practical Approach, Oxford
University Press, London).

In another embodiment of the invention, nucleic acid sequences encoding KPP may be used to
generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries (Harrington, J.J. et al. (1997) Nat. Genet.
15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; Trask, B.J. (1991) Trends Genet. 7:149-154). Once
mapped, the nucleic acid sequences may be used to develop genetic linkage maps, for example, which
correlate the inheritance of a disease state with the inheritance of a particular chromosome region or
restriction fragment length polymorphism (RFLP) (Lander, E.S. and D. Botstein (1986) Proc. Natl.
Acad. Sci. USA 83:7353-7357).

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data (Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968). Examples of genetic map data
can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM)
World Wide Web site. Correlation between the location of the gene encoding KPP on a physical map
and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA
associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as 
linkage analysis using established chromosomal markers, may be used for extending genetic maps. 
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any
sequences mapping to that area may represent associated or regulatory genes for further investigation
(Gatti, R.A. et al. (1988) Nature 336:577-580). The nucleotide sequence of the instant invention may
also be used to detect differences in the chromosomal location due to translocation, inversion, etc.,
among normal, carrier, or affected individuals.

In another embodiment of the invention, KPP, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between KPP and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest (Geysen, et al. (1984) PCT application
WO84/03564). In this method, large numbers of different small test compounds are synthesized on a
solid substrate. The test compounds are reacted with KPP, or fragments thereof, and washed.
Bound KPP is then detected by methods well known in the art. Purified KPP can also be coated
directly onto plates for use in the aforementioned drug screening techniques. Alternatively,
non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing
antibodies capable of binding KPP specifically compete with a test compound for binding KPP. In this
manner, antibodies can be used to detect the presence of any peptide which shares one or more
antigenic determinants with KPP.

In additional embodiments, the nucleotide sequences which encode KPP may be used in any
molecular biology techniques that have yet to be developed, provided the new techniques rely on 
properties of nucleotide sequences that are currently known, including, but not limited to, such 
properties as the triplet genetic code and specific base pair interactions. 

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are, therefore,
to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way 
whatsoever. 

The disclosures of all patents, applications and publications, mentioned above and below, are 
expressly incorporated by reference herein.

EXAMPLES 

I. Construction of cDNA Libraries

Incyte cDNAs are derived from cDNA libraries described in the LIFESEQ database (Incyte, 
Palo Alto CA) and shown in Table 4, column 3. Some tissues are homogenized and lysed in 
guanidinium isothiocyanate, while others are homogenized and lysed in phenol or in a suitable mixture
of denaturants, such as TRIZOL (Invitrogen), a monophasic solution of phenol and guanidine
isothiocyanate. The resulting lysates are centrifuged over CsCl cushions or extracted with
chloroform. RNA is precipitated from the lysates with either isopropanol or sodium acetate and
ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA are repeated as necessary to increase RNA
purity. In some cases, RNA is treated with DNase. For most libraries, poly(A)+ RNA is isolated
using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA is
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA
purification kit (Ambion, Austin TX). 

In some cases, Stratagene is provided with RNA and constructs the corresponding cDNA 
libraries. Otherwise, cDNA is synthesized and cDNA libraries are constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Invitrogen), using the recommended
procedures or similar methods known in the art (Ausubel et al., supra, ch. 5). Reverse transcription is
initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters are ligated to double
stranded cDNA, and the cDNA is digested with the appropriate restriction enzyme or enzymes. For
most libraries, the cDNA is size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE
CL2B, or SEPHAROSE CL4B column chromatography (Amersham Biosciences) or preparative
agarose gel electrophoresis. cDNAs are ligated into compatible restriction enzyme sites of the
polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid
(Invitrogen, Carlsbad CA), PCDNA2.1 plasmid (Invitrogen), PBK-CMV plasmid (Stratagene),
PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte, Palo Alto CA),
pRARE (Incyte), or pINCY (Incyte), or derivatives thereof. Recombinant plasmids are transformed
into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α,
DH10B, or ElectroMAX DH10B from Invitrogen.

II. Isolation of cDNA Clones

Plasmids obtained as described in Example I are recovered from host cells by in vivo excision
using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids are purified using at least
one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC
Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP
96 plasmid purification kit from QIAGEN. Following precipitation, plasmids are resuspended in 0.1 ml
of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA is amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal cycling
steps are carried out in a single reaction mixture. Samples are processed and stored in 384-well
plates, and the concentration of amplified plasmid DNA is quantified fluorometrically using
PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence scanner
(Labsystems Oy, Helsinki, Finland).

III. Sequencing and Analysis 

Incyte cDNAs recovered in plasmids as described in Example II are sequenced as follows.
Sequencing reactions are processed using standard methods or high-throughput instrumentation such
as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler
(MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions are prepared using
reagents provided by Amersham Biosciences or supplied in ABI sequencing kits such as the ABI
PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides are
carried out using the MEGABACE 1000 DNA sequencing system (Amersham Biosciences); the ABI
PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences are identified using standard methods (Ausubel et al., supra, ch.
7). Some of the cDNA sequences are selected for extension using the techniques disclosed in
Example VIII.

Polynucleotide sequences derived from Incyte cDNAs are validated by removing vector,
linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs based
on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte cDNA
sequences or translations thereof are then queried against a selection of public databases such as the
GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS, PRINTS,
DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens, Rattus
norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, and Candida albicans (Incyte, Palo Alto CA); hidden Markov model
(HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM (Haft, D.H. et al.
(2001) Nucleic Acids Res. 29:41-43); and HMM-based protein domain databases such as SMART
(Schultz, J. et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002) Nucleic
Acids Res. 30:242-244). (HMM is a probabilistic approach which analyzes consensus primary
structures of gene families; see, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.)
The queries are performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The
Incyte cDNA sequences are assembled to produce full length polynucleotide sequences.
Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or
Genscan-predicted coding sequences (see Examples IV and V) are used to extend Incyte cDNA
assemblages to full length. Assembly is performed using programs based on Phred, Phrap, and
Consed, and cDNA assemblages are screened for open reading frames using programs based on
GeneMark, BLAST, and FASTA. The full length polynucleotide sequences are translated to derive
the corresponding full length polypeptide sequences. Alternatively, a polypeptide may begin at any of
the methionine residues of the full length translated polypeptide. Full length polypeptide sequences are
subsequently analyzed by querying against databases such as the GenBank protein databases
(genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM, Prosite,
hidden Markov model (HMM)-based protein family databases such as PFAM, INCY, and
TIGRFAM; and HMM-based protein domain databases such as SMART. Full length polynucleotide
sequences are also analyzed using MACDNASIS PRO software (MiraiBio, Alameda CA) and
LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are
generated using default parameters specified by the CLUSTAL algorithm as incorporated into the
MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent
identity between aligned sequences.

Table 7 summarizes tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the
second column provides brief descriptions thereof, the third column presents appropriate references,
all of which are incorporated by reference herein in their entirety, and the fourth column presents,
where applicable, the scores, probability values, and other parameters used to evaluate the strength of
a match between two sequences (the higher the score or the lower the probability value, the greater
the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide and
polypeptide sequences are also used to identify polynucleotide sequence fragments from SEQ ID
NO:16-30. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and
amplification technologies are described in Table 4, column 2.

IV. Identification and Editing of Coding Sequences from Genomic DNA

Putative kinases and phosphatases are initially identified by running the Genscan gene
identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is
a general-purpose gene identification program which analyzes genomic DNA sequences from a
variety of organisms (Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94; Burge, C. and S. Karlin
(1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an
assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a
FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for
Genscan to analyze at once is set to 30 kb. To determine which of these Genscan predicted cDNA
sequences encode kinases and phosphatases, the encoded polypeptides are analyzed by querying
against PFAM models for kinases and phosphatases. Potential kinases and phosphatases are also
identified by homology to Incyte cDNA sequences that have been annotated as kinases and
phosphatases. These selected Genscan-predicted sequences are then compared by BLAST analysis
to the genpept and gbpri public databases. Where necessary, the Genscan-predicted sequences are
then edited by comparison to the top BLAST hit from genpept to correct errors in the sequence
predicted by Genscan, such as extra or omitted exons. BLAST analysis is also used to find any Incyte
cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for
transcription. When Incyte cDNA coverage is available, this information is used to correct or confirm
the Genscan predicted sequence. Full length polynucleotide sequences are obtained by assembling
Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences
using the assembly process described in Example III. Alternatively, full length polynucleotide
sequences are derived entirely from edited or unedited Genscan-predicted coding sequences.

V. Assembly of Genomic Sequence Data with cDNA Sequence Data

"Stitched" Sequences

Partial cDNA sequences are extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III are mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster is analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that are subsequently confirmed, edited, or extended to create a full
length sequence. Sequence intervals in which the entire length of the interval is present on more than
one sequence in the cluster are identified, and intervals thus identified are considered to be equivalent
by transitivity. For example, if an interval is present on a cDNA and two genomic sequences, then all
three intervals are considered to be equivalent. This process allows unrelated but consecutive
genomic sequences to be brought together, bridged by cDNA sequence. Intervals thus identified are
then "stitched" together by the stitching algorithm in the order that they appear along their parent
sequences to generate the longest possible sequence, as well as sequence variants. Linkages between
intervals which proceed along one type of parent sequence (cDNA to cDNA or genomic sequence to
genomic sequence) are given preference over linkages which change parent type (cDNA to genomic
sequence). The resultant stitched sequences are translated and compared by BLAST analysis to the
genpept and gbpri public databases. Incorrect exons predicted by Genscan are corrected by
comparison to the top BLAST hit from genpept. Sequences are further extended with additional
cDNA sequences, or by inspection of genomic DNA, when necessary.
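
The grouping of intervals "by transitivity" described above may be illustrated with a short sketch. The Python fragment below is not the stitching algorithm itself, whose details are not given in this description; it only shows one conventional way (a union-find structure) to collect intervals into equivalence classes once pairwise matches are known. The function name `equivalence_classes` and the interval labels are hypothetical and chosen for illustration only.

```python
def equivalence_classes(matched_pairs):
    """Group interval labels into classes that are equivalent by transitivity.

    matched_pairs: iterable of (label_a, label_b) tuples, each stating that
    the same interval was found on two parent sequences (assumed input).
    """
    parent = {}

    def find(x):
        # Locate the representative of x, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    classes = {}
    for label in list(parent):
        classes.setdefault(find(label), set()).add(label)
    return sorted(sorted(members) for members in classes.values())


# An interval seen on one cDNA and two genomic sequences forms one class.
pairs = [("cDNA1", "genomicA"), ("cDNA1", "genomicB"), ("cDNA2", "genomicC")]
print(equivalence_classes(pairs))
```

In this sketch the interval shared by cDNA1, genomicA, and genomicB ends up in a single class even though genomicA and genomicB were never compared directly, which mirrors the cDNA-bridged case described in the text.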

"Stretched" Sequences

Partial DNA sequences are extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III are queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog is then compared by BLAST
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein is generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both are used as probes to search for homologous
genomic sequences from the public human genome databases. Partial DNA sequences are therefore
"stretched" or extended by the addition of homologous genomic sequences. The resultant stretched
sequences are examined to determine whether they contain a complete gene.

VI. Chromosomal Mapping of KPP Encoding Polynucleotides

The sequences used to assemble SEQ ID NO:16-30 are compared with sequences from the
Incyte LIFESEQ database and public domain databases using BLAST and other implementations of
the Smith-Waterman algorithm. Sequences from these databases that matched SEQ ID NO:16-30
are assembled into clusters of contiguous and overlapping sequences using assembly algorithms such
as Phrap (Table 7). Radiation hybrid and genetic mapping data available from public resources such
as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR),
and Généthon are used to determine if any of the clustered sequences have been previously mapped.
Inclusion of a mapped sequence in a cluster results in the assignment of all sequences of that cluster,
including its particular SEQ ID NO:, to that map location.

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances
are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid
markers whose sequences were included in each of the clusters. Human genome maps and other
resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease genes map
within or in proximity to the intervals indicated above.

VII. Analysis of Polynucleotide Expression

Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound (Sambrook and Russell, supra, ch. 7; Ausubel et
al., supra, ch. 4).

Analogous computer techniques applying BLAST are used to search for identical or related
molecules in databases such as GenBank or LIFESEQ (Incyte). This analysis is much faster than
multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be
modified to determine whether any particular match is categorized as exact or similar. The basis of
the search is the product score, which is defined as:

$$\text{Product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}$$

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79%
identity and 100% overlap.
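
The product score arithmetic described above may be illustrated with a minimal Python sketch. It assumes a single ungapped HSP summarized only by its counts of matching and mismatching bases; the function names `blast_score` and `product_score` are hypothetical, and clamping negative values to zero is an assumption made here to keep the result within the stated 0 to 100 range.

```python
def blast_score(matches: int, mismatches: int) -> int:
    """Score an ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches: int, mismatches: int, len_seq1: int, len_seq2: int) -> float:
    """Normalize the BLAST score by identity and by the shorter sequence length."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    score = blast_score(matches, mismatches) * percent_identity / (5 * shorter)
    return max(0.0, score)  # assumption: negative scores are reported as 0


# 100% identity over the whole of the shorter (100-base) sequence.
print(product_score(100, 0, 100, 250))
# 100% identity over 70% of the shorter sequence.
print(product_score(70, 0, 100, 100))
# 88% identity over 100% of the shorter sequence.
print(round(product_score(88, 12, 100, 100), 1))
```

Under these assumptions the first two calls give 100 and 70, and the third gives a value just under 70, consistent with the approximate figures quoted in the text.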

Alternatively, polynucleotides encoding KPP are analyzed with respect to the tissue sources
from which they are derived. For example, some full length sequences are assembled, at least in part,
with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived from
a cDNA library constructed from a human tissue. Each human tissue is classified into one of the
following organ/tissue categories: cardiovascular system; connective tissue; digestive system;
embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells;
hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory
system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number
of libraries in each category is counted and divided by the total number of libraries across all
categories. Similarly, each human tissue is classified into one of the following disease/condition
categories: cancer, cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled,
and other, and the number of libraries in each category is counted and divided by the total number of
libraries across all categories. The resulting percentages reflect the tissue- and disease-specific
expression of cDNA encoding KPP. cDNA sequences and cDNA library/tissue information are
found in the LIFESEQ database (Incyte, Palo Alto CA).
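
The per-category percentages described above amount to a simple count-and-divide step, sketched below in Python. The function name `category_percentages` and the example library list are hypothetical; the sketch assumes each library has already been assigned to exactly one category.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return the percentage of libraries falling in each category.

    library_categories: one category label per cDNA library (assumed input).
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Four hypothetical libraries contributing sequences to one polynucleotide.
libraries = ["nervous system", "nervous system", "liver", "skin"]
print(category_percentages(libraries))
```

The same function may be applied separately to the organ/tissue labels and to the disease/condition labels, since the two classifications are tallied independently.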
VIII. Extension of KPP Encoding Polynucleotides 

Full length polynucleotides are produced by extension of an appropriate fragment of the full
length molecule using oligonucleotide primers designed from this fragment. One primer is synthesized
to initiate 5' extension of the known fragment, and the other primer is synthesized to initiate 3'
extension of the known fragment. The initial primers are designed using OLIGO 4.06 software
(National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in length, to
have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of
about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin structures and
primer-primer dimerizations is avoided.
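
Two of the primer criteria stated above, length and GC content, can be checked with a few lines of code. The following Python sketch is illustrative only and does not reproduce the OLIGO 4.06 software; the function names are hypothetical, and the annealing temperature, hairpin, and primer-dimer criteria are deliberately not modeled.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def passes_basic_primer_checks(seq: str, min_len: int = 22, max_len: int = 30,
                               min_gc: float = 0.50) -> bool:
    """Check only the length window and the minimum GC content."""
    return min_len <= len(seq) <= max_len and gc_content(seq) >= min_gc


# Hypothetical candidate primers.
print(passes_basic_primer_checks("ATGCGCGCTAGCTAGCGCGCATGC"))  # 24 nt, GC-rich
print(passes_basic_primer_checks("ATATATATATATATATATATATAT"))  # 24 nt, no G or C
```

A candidate that passes these two checks would still need to be evaluated for annealing temperature and secondary structure by an appropriate program.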

Selected human cDNA libraries are used to extend the sequence. If more than one extension 
is necessary or desired, additional or nested sets of primers are designed. 

High fidelity amplification is obtained by PCR using methods well known in the art. PCR is
performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction mix
contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and
2-mercaptoethanol, Taq DNA polymerase (Amersham Biosciences), ELONGASE enzyme (Invitrogen),
and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI
B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps
2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the alternative, the
parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec;
Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5
min; Step 7: storage at 4°C.

The concentration of DNA in each well is determined by dispensing 100 μl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate is scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture is analyzed by electrophoresis
on a 1% agarose gel to determine which reactions are successful in extending the sequence.

The extended nucleotides are desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Biosciences). For shotgun
sequencing, the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose gels,
fragments are excised, and agar digested with Agar ACE (Promega). Extended clones are religated
using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham Biosciences),
treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transfected
into competent E. coli cells. Transformed cells are selected on antibiotic-containing media, and
individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x carb liquid
media.

The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham
Biosciences) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94°C, 3
min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4 repeated
29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN reagent
(Molecular Probes) as described above. Samples with low DNA recoveries are reamplified using the
same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2, v/v), and
sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit
(Amersham Biosciences) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction
kit (Applied Biosystems).

In like manner, full length polynucleotides are verified using the above procedure or are used
to obtain 5' regulatory sequences using the above procedure along with oligonucleotides designed for
such extension, and an appropriate genomic library.

IX. Identification of Single Nucleotide Polymorphisms in KPP Encoding Polynucleotides

Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) are
identified in SEQ ID NO:16-30 using the LIFESEQ database (Incyte). Sequences from the same
gene are clustered together and assembled as described in Example III, allowing the identification of
all sequence variants in the gene. An algorithm consisting of a series of filters is used to distinguish
SNPs from other sequence variants. Preliminary filters remove the majority of basecall errors by
requiring a minimum Phred quality score of 15, and remove sequence alignment errors and errors
resulting from improper trimming of vector sequences, chimeras, and splice variants. An automated
procedure of advanced chromosome analysis is applied to the original chromatogram files in the
vicinity of the putative SNP. Clone error filters use statistically generated algorithms to identify errors
introduced during laboratory processing, such as those caused by reverse transcriptase, polymerase, or
somatic mutation. Clustering error filters use statistically generated algorithms to identify errors
resulting from clustering of close homologs or pseudogenes, or due to contamination by non-human
sequences. A final set of filters removes duplicates and SNPs found in immunoglobulins or T-cell
receptors.
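
The idea of passing candidate variants through a series of filters may be sketched as follows. This Python fragment is a simplified illustration and not the filtering algorithm referred to above: it models only the minimum Phred quality threshold of 15 and the removal of duplicates, and the record layout (dictionaries with `position`, `allele`, and `phred` keys) and function names are hypothetical.

```python
def passes_quality(candidate: dict, min_phred: int = 15) -> bool:
    """Keep a candidate only if its basecall meets the minimum Phred score."""
    return candidate["phred"] >= min_phred


def remove_duplicates(candidates: list) -> list:
    """Keep the first candidate seen for each (position, allele) pair."""
    seen = set()
    unique = []
    for candidate in candidates:
        key = (candidate["position"], candidate["allele"])
        if key not in seen:
            seen.add(key)
            unique.append(candidate)
    return unique


def apply_filters(candidates: list, min_phred: int = 15) -> list:
    """Apply the quality filter first, then the duplicate filter."""
    kept = [c for c in candidates if passes_quality(c, min_phred)]
    return remove_duplicates(kept)


# Hypothetical candidate variants.
candidates = [
    {"position": 101, "allele": "A", "phred": 30},
    {"position": 101, "allele": "A", "phred": 22},  # duplicate of the first
    {"position": 250, "allele": "G", "phred": 9},   # below the quality threshold
    {"position": 300, "allele": "T", "phred": 15},
]
print([c["position"] for c in apply_filters(candidates)])
```

A Phred score of 15 corresponds to an estimated basecall error probability of about 3%, so the threshold discards the least reliable calls before the later, more specialized filters are applied.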

Certain SNPs are selected for further characterization by mass spectrometry using the high
throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in
four different human populations. The Caucasian population comprises 92 individuals (46 male, 46
female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The
African population comprises 194 individuals (97 male, 97 female), all African Americans. The
Hispanic population comprises 324 individuals (162 male, 162 female), all Mexican Hispanic. The
Asian population comprises 126 individuals (64 male, 62 female) with a reported parental breakdown
of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele
frequencies are first analyzed in the Caucasian population; in some cases those SNPs which show no
allelic variance in this population are not further tested in the other three populations.

X. Labeling and Use of Individual Hybridization Probes

Hybridization probes derived from SEQ ID NO:16-30 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 μCi of
[γ-32P] adenosine triphosphate (Amersham Biosciences), and T4 polynucleotide kinase (DuPont NEN,
Boston MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25
superfine size exclusion dextran bead column (Amersham Biosciences). An aliquot containing 10^7
counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of
human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I,
Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to
NYTRAN PLUS nylon membranes (Schleicher & Schuell, Durham NH). Hybridization is carried out
for 16 hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room
temperature under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium
dodecyl sulfate. Hybridization patterns are visualized using autoradiography or an alternative imaging
means and compared.

XI. Microarrays

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler et al., supra),
mechanical microspotting technologies, and derivatives thereof. The substrate in each of the
aforementioned technologies should be uniform and solid with a non-porous surface (Schena, M., ed.
(1999) DNA Microarrays: A Practical Approach, Oxford University Press, London). Suggested
substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a procedure
analogous to a dot or slot blot may also be used to arrange and link elements to the surface of a
substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may be
produced using available methods and machines well known to those of ordinary skill in the art and
may contain any appropriate number of elements (Schena, M. et al. (1995) Science 270:467-470;
Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat. Biotechnol.
16:27-31).

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be
selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection.
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element on
the microarray may be assessed. In one embodiment, microarray preparation and usage is described
in detail below.

Tissue or Cell Sample Preparation

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-(dT) primer (21mer), 1X first
strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM
dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Biosciences). The reverse transcription
reaction is performed in a 25 ml volume containing 200 ng poly(A)+ RNA with GEMBRIGHT kits
(Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription from non-coding
yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one with Cy3 and
another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20
minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified using two
successive CHROMA SPIN 30 gel filtration spin columns (BD Clontech, Palo Alto CA) and after
combining, both reaction samples are ethanol precipitated using 1 ml of glycogen (1 mg/ml), 60 ml
sodium acetate, and 300 ml of 100% ethanol. The sample is then dried to completion using a
SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in 14 μl 5X SSC/0.2% SDS.

Microarray Preparation

Sequences of the present invention are used to generate array elements. Each array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses
primers complementary to the vector sequences flanking the cDNA insert. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg.
Amplified array elements are then purified using SEPHACRYL-400 (Amersham Biosciences).

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, and
coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis MO) in 95% ethanol. Coated slides
are cured in a 110°C oven.

Array elements are applied to the coated glass substrate using a procedure described in U.S.
Patent No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average
concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in 0.2%
SDS and distilled water as before.
Hybridization

Hybridization reactions contain 9 μl of sample mixture consisting of 0.2 μg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with
a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140
μl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for about
6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1%
SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.

Detection

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned
past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source,
although the apparatus is capable of recording the spectra from both fluorophores simultaneously.

The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the sample mixture at a known concentration. A specific location on
the array contains a complementary DNA sequence, allowing the intensity of the signal at that location
to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples from
different sources (e.g., representing test and control cells), each labeled with a different fluorophore,
are hybridized to a single array for the purpose of identifying genes that are differentially expressed,
the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and
adding identical amounts of each to the hybridization mixture.

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data is also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission
spectra) between the fluorophores using each fluorophore's emission spectrum.
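
The crosstalk correction can be illustrated as a two-channel linear unmixing. The following is a minimal sketch only and is not the procedure of any particular analysis program: the function name and the two bleed-through fractions are hypothetical, and in practice the fractions would be derived from each fluorophore's measured emission spectrum and the detector filters.

```python
def unmix_two_channels(measured_cy3, measured_cy5, cy5_into_cy3, cy3_into_cy5):
    """Correct two simultaneously measured channels for optical crosstalk.

    Assumed (hypothetical) linear model:
        measured_cy3 = true_cy3 + cy5_into_cy3 * true_cy5
        measured_cy5 = cy3_into_cy5 * true_cy3 + true_cy5
    where each bleed-through fraction is the share of one fluorophore's
    emission that reaches the other fluorophore's detector.
    """
    det = 1.0 - cy5_into_cy3 * cy3_into_cy5
    if det == 0.0:
        raise ValueError("channels are not separable with these fractions")
    true_cy3 = (measured_cy3 - cy5_into_cy3 * measured_cy5) / det
    true_cy5 = (measured_cy5 - cy3_into_cy5 * measured_cy3) / det
    return true_cy3, true_cy5


if __name__ == "__main__":
    # Illustrative numbers only: 2% of Cy5 reaches the Cy3 detector,
    # 10% of Cy3 reaches the Cy5 detector.
    print(unmix_two_channels(1010.0, 600.0, 0.02, 0.10))
```

With both fractions set to zero the function returns the measured values unchanged, which corresponds to the sequential two-scan mode in which no correction is needed.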
A grid is superimposed over the fluorescence signal image such that the signal from each spot
is centered in each element of the grid. The fluorescence signal within each element is then integrated
to obtain a numerical value corresponding to the average intensity of the signal. The software used
for signal analysis is the GEMTOOLS gene expression analysis program (Incyte). Array elements
that exhibit at least about a two-fold change in expression, a signal-to-background ratio of at least
about 2.5, and an element spot size of at least about 40%, are considered to be differentially
expressed.
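
The three selection criteria above can be expressed as a simple filter. This is a minimal sketch under stated assumptions, not the GEMTOOLS implementation: the `ArrayElement` fields and the function name are hypothetical, the fold change is taken as the larger of the Cy5/Cy3 ratio and its inverse, and the signal-to-background ratio is computed from the brighter channel.

```python
from dataclasses import dataclass


@dataclass
class ArrayElement:
    # All field names are illustrative assumptions, not GEMTOOLS output fields.
    name: str
    cy3_signal: float          # integrated intensity, channel 1
    cy5_signal: float          # integrated intensity, channel 2
    background: float          # local background intensity
    spot_size_fraction: float  # spot area as a fraction of the grid element (0-1)


def is_differentially_expressed(element, min_fold=2.0, min_signal_to_background=2.5,
                                min_spot_fraction=0.40):
    """Apply the two-fold, 2.5 signal-to-background, and 40% spot-size thresholds."""
    if min(element.cy3_signal, element.cy5_signal, element.background) <= 0:
        return False  # ratios are undefined for non-positive intensities
    ratio = element.cy5_signal / element.cy3_signal
    fold_change = max(ratio, 1.0 / ratio)
    signal_to_background = max(element.cy3_signal, element.cy5_signal) / element.background
    return (fold_change >= min_fold
            and signal_to_background >= min_signal_to_background
            and element.spot_size_fraction >= min_spot_fraction)


if __name__ == "__main__":
    spot = ArrayElement("example_spot", 400.0, 1000.0, 100.0, 0.55)
    print(spot.name, is_differentially_expressed(spot))
```

Because the source states each threshold as "at least about", the cutoffs are exposed as parameters rather than fixed constants.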
Expression 

For example, expression of SEQ ID NO:18 was downregulated in brain tissue affected by
Alzheimer's Disease versus normal brain tissue as determined by microarray analysis. Specific
dissected brain regions from the brains of patients with AD were compared to dissected regions from
normal brain. The diagnosis of normal or AD was established by a certified neuropathologist based on
microscopic examination of multiple sections throughout the brain. Expression of SEQ ID NO:18 was
decreased at least two-fold in 7 of 10 AD-affected tissue samples. Therefore, in various
embodiments, SEQ ID NO:18 can be used for one or more of the following: i) monitoring treatment of
Alzheimer's Disease, ii) diagnostic assays for Alzheimer's Disease, and iii) developing therapeutics
and/or other treatments for Alzheimer's Disease as determined by microarray analysis.

As another example, SEQ ID NO:16 and SEQ ID NO:18 were downregulated in breast
cancer cells versus nonmalignant mammary epithelial cells, as determined by microarray analysis.
Cell lines compared included: a) MCF-10A, a breast mammary gland (luminal ductal characteristics)
cell line isolated from a 36-year-old woman with fibrocystic breast disease, b) MCF7, a nonmalignant
breast adenocarcinoma cell line isolated from the pleural effusion of a 69-year-old female, c) BT-20, a
breast carcinoma cell line derived in vitro from the cells emigrating out of thin slices of tumor mass
isolated from a 74-year-old female, d) T-47D, a breast carcinoma cell line isolated from a pleural
effusion obtained from a 54-year-old female with an infiltrating ductal carcinoma of the breast, e)
Sk-BR-3, a breast adenocarcinoma cell line isolated from a malignant pleural effusion of a 43-year-old
female, f) MDA-mb-231, a breast tumor cell line isolated from the pleural effusion of a 51-year-old
female, g) MDA-mb-435S, a spindle-shaped strain that evolved from the parent line (435) isolated by
R. Cailleau from pleural effusion of a 31-year-old female with metastatic, ductal adenocarcinoma of
the breast, and h) HMEC, a primary breast epithelial cell line isolated from a normal donor. Expression
of SEQ ID NO:16 was decreased at least two-fold in the Sk-BR-3, BT-20, MDA-mb-435S, T-47D,
and MCF7 cell lines as compared to the normal breast epithelial cells. Expression of SEQ ID NO:18
was decreased at least two-fold in the MCF-10A, T-47D, Sk-BR-3, and MCF7 cell lines as compared
to the normal breast epithelial cells. Therefore, in various embodiments, SEQ ID NO:16 and SEQ ID
NO:18 can be used for one or more of the following: i) monitoring treatment of breast cancer, ii)
diagnostic assays for breast cancer, and iii) developing therapeutics and/or other treatments for breast
cancer as determined by microarray analysis.

As another example, SEQ ID NO:18 and SEQ ID NO:21 showed tissue-specific expression
as determined by microarray analysis. RNA samples isolated from a variety of normal human tissues
were compared to a common reference sample. Tissues contributing to the reference sample were
selected for their ability to provide a complete distribution of RNA in the human body and include brain
(4%), heart (7%), kidney (3%), lung (8%), placenta (46%), small intestine (9%), spleen (3%), stomach
(6%), testis (9%), and uterus (5%). The normal tissues assayed were obtained from at least three
different donors. RNA from each donor was separately isolated and individually hybridized to the
microarray. Since these hybridization experiments were conducted using a common reference
sample, differential expression values are directly comparable from one tissue to another. The
expression of SEQ ID NO:18 was increased by at least two-fold in brain cortex tissue as compared to
the reference sample. Therefore, SEQ ID NO:18 can be used as a tissue marker for brain cortex
tissue. The expression of SEQ ID NO:21 was increased by at least two-fold in heart tissue as
compared to the reference sample. Therefore, SEQ ID NO:21 can be used as a tissue marker for
heart tissue.

XII. Complementary Polynucleotides 

Sequences complementary to the KPP-encoding sequences, or any parts thereof, are used to
detect, decrease, or inhibit expression of naturally occurring KPP. Although use of oligonucleotides
comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with
smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO
4.06 software (National Biosciences) and the coding sequence of KPP. To inhibit transcription, a
complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent
promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is
designed to prevent ribosomal binding to the KPP-encoding transcript.
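
The core operation in designing a complementary oligonucleotide is taking the reverse complement of a window of the coding sequence. The sketch below is illustrative only and does not reproduce the OLIGO 4.06 software: the function names and the example sequence are hypothetical, and no scoring for uniqueness, melting temperature, or secondary structure is attempted.

```python
# Translation table for DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def antisense_candidates(coding_sequence, length=20):
    """Yield (start, oligo) pairs for every window of the given length.

    Each oligo is written 5' to 3' and is complementary to
    coding_sequence[start:start + length]. The default of 20 bases is an
    arbitrary choice within the 15 to 30 base range described in the text.
    """
    for start in range(len(coding_sequence) - length + 1):
        window = coding_sequence[start:start + length]
        yield start, reverse_complement(window)


if __name__ == "__main__":
    example = "ATGGCCAAGT"  # made-up sequence, not a KPP sequence
    for start, oligo in antisense_candidates(example, length=6):
        print(start, oligo)
```

A real design step would rank these candidates, for example by uniqueness of the 5' region, before synthesis.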

XIII. Expression of KPP

Expression and purification of KPP is achieved using bacterial or virus-based expression
systems. For expression of KPP in bacteria, cDNA is subcloned into an appropriate vector containing
an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription.
Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the
T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element.
Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic
resistant bacteria express KPP upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG).
Expression of KPP in eukaryotic cells is achieved by infecting insect or mammalian cell lines with
recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as
baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding KPP
by either homologous recombination or bacterial-mediated transposition involving transfer plasmid
intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of
cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect
cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional
genetic modifications to baculovirus (Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945).

In most expression systems, KPP is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton
enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized
glutathione under conditions that maintain protein activity and antigenicity (Amersham Biosciences).
Following purification, the GST moiety can be proteolytically cleaved from KPP at specifically
engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using
commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 6-His, a
stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN).
Methods for protein expression and purification are discussed in Ausubel et al. (supra, ch. 10 and 16).
Purified KPP obtained by these methods can be used directly in the assays shown in Examples XVII,
XVIII, XIX, XX, and XXI, where applicable.
XIV. Functional Assays

KPP function is assessed by expressing the sequences encoding KPP at physiologically
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice
include PCMV SPORT plasmid (Invitrogen, Carlsbad CA) and PCR3.1 plasmid (Invitrogen), both of
which contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently
transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either
liposome formulations or electroporation. 1-2 μg of an additional plasmid containing sequences
encoding a marker protein are co-transfected. Expression of a marker protein provides a means to
distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression
from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein
(GFP; BD Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated,
laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to
evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the
uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These
events include changes in nuclear DNA content as measured by staining of DNA with propidium
iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side
light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine
uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity
with specific antibodies; and alterations in plasma membrane composition as measured by the binding
of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are
discussed in Ormerod, M.G. (1994; Flow Cytometry, Oxford, New York NY).

The influence of KPP on gene expression can be assessed using highly purified populations of
cells transfected with sequences encoding KPP and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding KPP and other genes of interest can be analyzed by northern analysis
or microarray techniques.

XV. Production of KPP Specific Antibodies 

KPP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols.

Alternatively, the KPP amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art (Ausubel et al., supra, ch. 11).

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to
increase immunogenicity (Ausubel et al., supra). Rabbits are immunized with the oligopeptide-KLH
complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-KPP
activity by, for example, binding the peptide or KPP to a substrate, blocking with 1% BSA, reacting
with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.
XVI. Purification of Naturally Occurring KPP Using Specific Antibodies

Naturally occurring or recombinant KPP is substantially purified by immunoaffinity
chromatography using antibodies specific for KPP. An immunoaffinity column is constructed by
covalently coupling anti-KPP antibody to an activated chromatographic resin, such as CNBr-activated
SEPHAROSE (Amersham Biosciences). After the coupling, the resin is blocked and washed
according to the manufacturer's instructions.

Media containing KPP are passed over the immunoaffinity column, and the column is washed
under conditions that allow the preferential absorbance of KPP (e.g., high ionic strength buffers in the
presence of detergent). The column is eluted under conditions that disrupt antibody/KPP binding (e.g.,
a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and
KPP is collected.

XVII. Identification of Molecules Which Interact with KPP 

KPP, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent
(Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539). Candidate molecules previously
arrayed in the wells of a multi-well plate are incubated with the labeled KPP, washed, and any wells
with labeled KPP complex are assayed. Data obtained using different concentrations of KPP are
used to calculate values for the number, affinity, and association of KPP with the candidate molecules.

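One conventional way to obtain the number of binding sites and the affinity from binding measured at several concentrations of labeled KPP is a Scatchard analysis. The source does not name a method, so the following is offered only as a minimal sketch under the assumption of a single class of non-interacting sites; the function name and the example numbers are hypothetical.

```python
def scatchard_fit(free, bound):
    """Estimate (Kd, Bmax) by linear regression of bound/free against bound.

    For one class of independent sites:
        bound / free = Bmax / Kd - bound / Kd
    so the slope is -1/Kd and the intercept is Bmax/Kd.
    free and bound must use consistent concentration units.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Kd = 5 and Bmax = 100.
    free = [1.0, 2.0, 5.0, 10.0, 20.0]
    bound = [100.0 * f / (5.0 + f) for f in free]
    print(scatchard_fit(free, bound))
```

With real, noisy data a nonlinear fit of the saturation curve is often preferred, because the Scatchard transformation distorts measurement error.
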
Alternatively, molecules interacting with KPP are analyzed using the yeast two-hybrid system
as described in Fields, S. and O. Song (1989; Nature 340:245-246), or using commercially available
kits based on the two-hybrid system, such as the MATCHMAKER system (BD Clontech).

KPP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S.
Patent No. 6,057,101).

XVIII. Demonstration of KPP Activity

Generally, protein kinase activity is measured by quantifying the phosphorylation of a protein
substrate by KPP in the presence of [γ-32P]ATP. KPP is incubated with the protein substrate,
32P-ATP, and an appropriate kinase buffer. The 32P incorporated into the substrate is separated from
free 32P-ATP by electrophoresis and the incorporated 32P is counted using a radioisotope counter.
The amount of incorporated 32P is proportional to the activity of KPP. A determination of the specific
amino acid residue phosphorylated is made by phosphoamino acid analysis of the hydrolyzed protein.

In one alternative, protein kinase activity is measured by quantifying the transfer of gamma
phosphate from adenosine triphosphate (ATP) to a serine, threonine or tyrosine residue in a protein
substrate. The reaction occurs between a protein kinase sample with a biotinylated peptide substrate
and gamma 32P-ATP. Following the reaction, free avidin in solution is added for binding to the
biotinylated 32P-peptide product. The binding sample then undergoes a centrifugal ultrafiltration
process with a membrane which will retain the product-avidin complex and allow passage of free
gamma 32P-ATP. The reservoir of the centrifuged unit containing the 32P-peptide product as retentate
is then counted in a scintillation counter. This procedure allows the assay of any type of protein kinase
sample, depending on the peptide substrate and kinase reaction buffer selected. This assay is provided
in kit form (ASUA, Affinity Ultrafiltration Separation Assay, Transbio Corporation, Baltimore MD,
U.S. Patent No. 5,869,275). Suggested substrates and their respective enzymes include but are not
limited to: Histone H1 (Sigma) and p34cdc2 kinase, Annexin I, Angiotensin (Sigma) and EGF receptor
kinase, Annexin II and src kinase, ERK1 & ERK2 substrates and MEK, and myelin basic protein and
ERK (Pearson, J.D. et al. (1991) Methods Enzymol. 200:62-81).

In another alternative, protein kinase activity of KPP is demonstrated in an assay containing
KPP, 50 μl of kinase buffer, 1 μg substrate, such as myelin basic protein (MBP) or synthetic peptide
substrates, 1 mM DTT, 10 μg ATP, and 0.5 μCi [γ-32P]ATP. The reaction is incubated at 30°C for
30 minutes and stopped by pipetting onto P81 paper. The unincorporated [γ-32P]ATP is removed by
washing and the incorporated radioactivity is measured using a scintillation counter. Alternatively, the
reaction is stopped by heating to 100°C in the presence of SDS loading buffer and resolved on a 12%
SDS polyacrylamide gel followed by autoradiography. The amount of incorporated 32P is proportional
to the activity of KPP.
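
Because the incorporated radioactivity is proportional to kinase activity, counts can be converted to an amount of phosphate transferred once the specific radioactivity of the ATP in the reaction is known. The following is a minimal sketch of that arithmetic; the function names, the blank subtraction step, and all example numbers are assumptions made for illustration and are not taken from the assay described above.

```python
def phosphate_incorporated_pmol(sample_cpm, blank_cpm, atp_cpm_per_pmol):
    """Convert scintillation counts to pmol of phosphate incorporated.

    blank_cpm is a no-enzyme (or no-substrate) control; atp_cpm_per_pmol is the
    specific radioactivity of the ATP pool in the reaction, measured separately.
    """
    if atp_cpm_per_pmol <= 0:
        raise ValueError("specific radioactivity must be positive")
    net_cpm = max(sample_cpm - blank_cpm, 0.0)
    return net_cpm / atp_cpm_per_pmol


def specific_activity(pmol, minutes, enzyme_ug):
    """Return activity in pmol phosphate per minute per microgram of enzyme."""
    return pmol / minutes / enzyme_ug


if __name__ == "__main__":
    pmol = phosphate_incorporated_pmol(12500.0, 500.0, 200.0)
    print(pmol, specific_activity(pmol, minutes=30.0, enzyme_ug=0.1))
```

The 30 minute incubation time in the usage example matches the text; the enzyme amount and count values are made up.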

In yet another alternative, adenylate kinase or guanylate kinase activity of KPP may be
measured by the incorporation of 32P from [γ-32P]ATP into ADP or GDP using a gamma radioisotope
counter. KPP, in a kinase buffer, is incubated together with the appropriate nucleotide
mono-phosphate substrate (AMP or GMP) and 32P-labeled ATP as the phosphate donor. The
reaction is incubated at 37°C and terminated by addition of trichloroacetic acid. The acid extract is
neutralized and subjected to gel electrophoresis to separate the mono-, di-, and triphosphonucleotide
fractions. The diphosphonucleotide fraction is excised and counted. The radioactivity recovered is
proportional to the activity of KPP.
In yet another alternative, other assays for KPP include scintillation proximity assays (SPA),
scintillation plate technology and filter binding assays. Useful substrates include recombinant proteins
tagged with glutathione transferase, or synthetic peptide substrates tagged with biotin. Inhibitors of
KPP activity, such as small organic molecules, proteins or peptides, may be identified by such assays.

In another alternative, phosphatase activity of KPP is measured by the hydrolysis of
para-nitrophenyl phosphate (PNPP). KPP is incubated together with PNPP in HEPES buffer pH 7.5, in
the presence of 0.1% β-mercaptoethanol at 37°C for 60 min. The reaction is stopped by the addition
of 6 ml of 10 N NaOH (Diamond, R.H. et al. (1994) Mol. Cell. Biol. 14:3752-62). Alternatively, acid
phosphatase activity of KPP is demonstrated by incubating KPP-containing extract with 100 μl of 10
mM PNPP in 0.1 M sodium citrate, pH 4.5, and 50 μl of 40 mM NaCl at 37°C for 20 min. The
reaction is stopped by the addition of 0.5 ml of 0.4 M glycine/NaOH, pH 10.4 (Saftig, P. et al. (1997)
J. Biol. Chem. 272:18628-18635). The increase in light absorbance at 410 nm resulting from the
hydrolysis of PNPP is measured using a spectrophotometer. The increase in light absorbance is
proportional to the activity of KPP in the assay.
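
The absorbance increase can be converted to an amount of product with the Beer-Lambert law. The sketch below is illustrative only: the function name is hypothetical, and the default extinction coefficient of 18,000 M^-1 cm^-1 for p-nitrophenolate under alkaline conditions is a commonly cited approximate value that is assumed here, not a figure from the source. A p-nitrophenol standard curve measured on the same instrument would be the more reliable calibration.

```python
def p_nitrophenol_nmol(delta_absorbance, final_volume_ml,
                       extinction_per_molar_cm=18000.0, path_length_cm=1.0):
    """Estimate nmol of p-nitrophenol released from the absorbance increase.

    Beer-Lambert: A = epsilon * c * l, so c = A / (epsilon * l) in mol/L.
    extinction_per_molar_cm is an assumed approximate value (see lead-in).
    final_volume_ml is the volume in the cuvette after the stop solution is added.
    """
    concentration_molar = delta_absorbance / (extinction_per_molar_cm * path_length_cm)
    moles = concentration_molar * final_volume_ml * 1e-3  # mL to L
    return moles * 1e9  # mol to nmol


if __name__ == "__main__":
    # Illustrative reading: absorbance rises by 0.36 in a 1 ml final volume.
    print(p_nitrophenol_nmol(0.36, 1.0))
```

Dividing the result by the incubation time and the amount of extract gives an activity that can be compared between samples.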

In the alternative, KPP activity is determined by measuring the amount of phosphate removed
from a phosphorylated protein substrate. Reactions are performed with 2 or 4 nM KPP in a final
volume of 30 μl containing 60 mM Tris, pH 7.6, 1 mM EDTA, 1 mM EGTA, 0.1% β-mercaptoethanol
and 10 nM substrate, 32P-labeled on serine/threonine or tyrosine, as appropriate. Reactions are
initiated with substrate and incubated at 30°C for 10-15 min. Reactions are quenched with 450 μl of
4% (w/v) activated charcoal in 0.6 M HCl, 90 mM Na4P2O7, and 2 mM NaH2PO4, then centrifuged
at 12,000 x g for 5 min. Acid-soluble 32Pi is quantified by liquid scintillation counting (Sinclair, C. et al.
(1999) J. Biol. Chem. 274:23666-23672).

XIX. Kinase Binding Assay

Binding of KPP to a FLAG-CD44cyt fusion protein can be determined by incubating KPP
with anti-KPP-conjugated immunoaffinity beads followed by incubating portions of the beads (having
10-20 ng of protein) with 0.5 ml of a binding buffer (20 mM Tris-HCl (pH 7.4), 150 mM NaCl, 0.1%
bovine serum albumin, and 0.05% Triton X-100) in the presence of 125I-labeled FLAG-CD44cyt fusion
protein (5,000 cpm/ng protein) at 4°C for 5 hours. Following binding, beads were washed thoroughly
in the binding buffer and the bead-bound radioactivity measured in a scintillation counter (Bourguignon,
L.Y.W. et al. (2001) J. Biol. Chem. 276:7327-7336). The amount of incorporated 32P is proportional
to the amount of bound KPP.
XX. Identification of KPP Inhibitors

Compounds to be tested are arrayed in the wells of a 384-well plate in varying concentrations
along with an appropriate buffer and substrate, as described in the assays in Example XVII. KPP
activity is measured for each well and the ability of each compound to inhibit KPP activity can be
determined, as well as the dose-response kinetics. This assay could also be used to identify molecules
which enhance KPP activity.
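
As an illustration of how dose-response data from such a plate might be summarized, the sketch below estimates the concentration giving 50% inhibition (IC50) by interpolating on a log-concentration scale between the two tested concentrations that bracket half of the uninhibited activity. The function names and example values are hypothetical, and a full analysis would more likely fit a sigmoidal model to all points.

```python
import math


def percent_inhibition(activity, uninhibited_activity):
    """Percent reduction in activity relative to the no-compound control."""
    return 100.0 * (1.0 - activity / uninhibited_activity)


def ic50_by_interpolation(concentrations, activities, uninhibited_activity):
    """Return the interpolated IC50, or None if 50% inhibition is not bracketed.

    concentrations must be positive (a log scale is used) and share one unit.
    """
    half = uninhibited_activity / 2.0
    points = sorted(zip(concentrations, activities))
    for (c_low, a_low), (c_high, a_high) in zip(points, points[1:]):
        if a_low >= half >= a_high and a_low != a_high:
            fraction = (a_low - half) / (a_low - a_high)
            log_ic50 = math.log10(c_low) + fraction * (math.log10(c_high) - math.log10(c_low))
            return 10.0 ** log_ic50
    return None


if __name__ == "__main__":
    doses = [0.1, 1.0, 10.0, 100.0]       # made-up concentrations
    measured = [95.0, 75.0, 25.0, 5.0]    # made-up activities
    print(ic50_by_interpolation(doses, measured, uninhibited_activity=100.0))
```

A compound that enhances activity would give negative values from `percent_inhibition`, so the same data layout serves for identifying activators.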
XXI. Identification of KPP Substrates

A KPP "substrate-trapping" assay takes advantage of the increased substrate affinity that
may be conferred by certain mutations in the PTP signature sequence of protein tyrosine
phosphatases. KPP bearing these mutations form a stable complex with their substrate; this complex
may be isolated biochemically. Site-directed mutagenesis of invariant residues in the PTP signature
sequence in a clone encoding the catalytic domain of KPP is performed using a method standard in
the art or a commercial kit, such as the MUTA-GENE kit from BIO-RAD. For expression of KPP
mutants in Escherichia coli, DNA fragments containing the mutation are exchanged with the
corresponding wild-type sequence in an expression vector bearing the sequence encoding KPP or a
glutathione S-transferase (GST)-KPP fusion protein. KPP mutants are expressed in E. coli and
purified by chromatography.

The expression vector is transfected into COS1 or 293 cells via calcium phosphate-mediated
transfection with 20 μg of CsCl-purified DNA per 10-cm dish of cells or 8 μg per 6-cm dish.
Forty-eight hours after transfection, cells are stimulated with 100 ng/ml epidermal growth factor to increase
tyrosine phosphorylation in cells, as the tyrosine kinase EGFR is abundant in COS cells. Cells are
lysed in 50 mM Tris-HCl, pH 7.5/5 mM EDTA/150 mM NaCl/1% Triton X-100/5 mM iodoacetic
acid/10 mM sodium phosphate/10 mM NaF/5 μg/ml leupeptin/5 μg/ml aprotinin/1 mM benzamidine (1
ml per 10-cm dish, 0.5 ml per 6-cm dish). KPP is immunoprecipitated from lysates with an
appropriate antibody. GST-KPP fusion proteins are precipitated with glutathione-Sepharose, 4 μg of
mAb or 10 μl of beads respectively per mg of cell lysate. Complexes can be visualized by PAGE or
further purified to identify substrate molecules (Flint, A.J. et al. (1997) Proc. Natl. Acad. Sci. USA
94:1680-1685).

Various modifications and variations of the described compositions, methods, and systems of
the invention will be apparent to those skilled in the art without departing from the scope and spirit of
the invention. It will be appreciated that the invention provides novel and useful proteins, and their
encoding polynucleotides, which can be used in the drug discovery process, as well as methods for
using these compositions for the detection, diagnosis, and treatment of diseases and conditions.

Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Nor should the description of such embodiments be considered exhaustive or limit the invention to the
precise forms disclosed. Furthermore, elements from one embodiment can be readily recombined with
elements from one or more other embodiments. Such combinations can form a number of
embodiments within the scope of the invention. It is intended that the scope of the invention be
defined by the following claims and their equivalents.

What is claimed is:

1. An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-15,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:2-4, SEQ ID NO:8-13 and SEQ ID NO:15,

c) a polypeptide comprising a naturally occurring amino acid sequence at least 97%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:5 and SEQ ID NO:6,

d) a polypeptide comprising a naturally occurring amino acid sequence at least 94%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:7 and SEQ ID NO:1,

e) a polypeptide consisting essentially of a naturally occurring amino acid sequence at
least 90% identical to the amino acid sequence of SEQ ID NO:14,

f) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15, and

g) an immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-15.

2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-15.

3. An isolated polynucleotide encoding a polypeptide of claim 1.

4. An isolated polynucleotide encoding a polypeptide of claim 2.

5. An isolated polynucleotide of claim 4, comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO:16-30.

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide of claim 3.

7. A cell transformed with a recombinant polynucleotide of claim 6.




8. A transgenic organism comprising a recombinant polynucleotide of claim 6.



9. A method of producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein
said cell is transformed with a recombinant polynucleotide, and said recombinant
polynucleotide comprises a promoter sequence operably linked to a polynucleotide
encoding the polypeptide of claim 1, and

b) recovering the polypeptide so expressed.

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15.

11. An isolated antibody which specifically binds to a polypeptide of claim 1.

12. An isolated polynucleotide selected from the group consisting of:

a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:16-30,

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of
SEQ ID NO:16-30,

c) a polynucleotide complementary to a polynucleotide of a),

d) a polynucleotide complementary to a polynucleotide of b), and

e) an RNA equivalent of a)-d).

12. An isolated polynucleotide selected from the group consisting of:

a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:16-30,

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of
SEQ ID NO:16-19 and SEQ ID NO:21-25,

c) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
99% identical to the polynucleotide sequence of SEQ ID NO:29,

d) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
96% identical to the polynucleotide sequence of SEQ ID NO:28,

e) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
93% identical to the polynucleotide sequence of SEQ ID NO:20,

f) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
92% identical to the polynucleotide sequence of SEQ ID NO:27,

g) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
91% identical to the polynucleotide sequence of SEQ ID NO:26,

h) a polynucleotide consisting essentially of a naturally occurring polynucleotide
sequence at least 90% identical to the polynucleotide sequence of SEQ ID NO:30,

i) a polynucleotide complementary to a polynucleotide of a),
j) a polynucleotide complementary to a polynucleotide of b),
k) a polynucleotide complementary to a polynucleotide of c),
l) a polynucleotide complementary to a polynucleotide of d),
m) a polynucleotide complementary to a polynucleotide of e),
n) a polynucleotide complementary to a polynucleotide of f),
o) a polynucleotide complementary to a polynucleotide of g),
p) a polynucleotide complementary to a polynucleotide of h), and
q) an RNA equivalent of a)-p).
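
The relationship between a polynucleotide, its complement, and its RNA equivalent can be sketched as follows. This is an illustrative assumption, not the specification's definition: "complementary" is taken here as the reverse complement written 5' to 3', and "RNA equivalent" as the same base sequence with thymine replaced by uracil. The function names are hypothetical.

```python
# Base-pairing table for DNA (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Return the same base sequence with T replaced by U (assumed convention)."""
    return dna.replace("T", "U").replace("t", "u")
```

Under these assumptions, the sequence `ATGGCC` has the complement `GGCCAT` and the RNA equivalent `AUGGCC`.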

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a
polynucleotide of claim 12.

14. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample,
and which probe specifically hybridizes to said target polynucleotide, under
conditions whereby a hybridization complex is formed between said probe and said
target polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides.

16. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable
excipient.

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15.

19. A method for treating a disease or condition associated with decreased expression of
functional KPP, comprising administering to a patient in need of such treatment the composition of
claim 17.

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of
claim 1, the method comprising:

a) contacting a sample comprising a polypeptide of claim 1 with a compound, and

b) detecting agonist activity in the sample.

21. A composition comprising an agonist compound identified by a method of claim 20 and a
pharmaceutically acceptable excipient.

22. A method for treating a disease or condition associated with decreased expression of
functional KPP, comprising administering to a patient in need of such treatment a composition of
claim 21.

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of
claim 1, the method comprising:

a) contacting a sample comprising a polypeptide of claim 1 with a compound, and

b) detecting antagonist activity in the sample.

24. A composition comprising an antagonist compound identified by a method of claim 23
and a pharmaceutically acceptable excipient.

25. A method for treating a disease or condition associated with overexpression of functional
KPP, comprising administering to a patient in need of such treatment a composition of claim 24.

26. A method of screening for a compound that specifically binds to the polypeptide of claim
1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under suitable
conditions, and

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby
identifying a compound that specifically binds to the polypeptide of claim 1.

27. A method of screening for a compound that modulates the activity of the polypeptide of
claim 1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under
conditions permissive for the activity of the polypeptide of claim 1,

b) assessing the activity of the polypeptide of claim 1 in the presence of the test
compound, and

c) comparing the activity of the polypeptide of claim 1 in the presence of the test
compound with the activity of the polypeptide of claim 1 in the absence of the test
compound, wherein a change in the activity of the polypeptide of claim 1 in the
presence of the test compound is indicative of a compound that modulates the activity
of the polypeptide of claim 1.

28. A method of screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method
comprising:

a) contacting a sample comprising the target polynucleotide with a compound, under
conditions suitable for the expression of the target polynucleotide,

b) detecting altered expression of the target polynucleotide, and

c) comparing the expression of the target polynucleotide in the presence of varying
amounts of the compound and in the absence of the compound.

29. A method of screening for potential toxicity of a test compound, the method comprising:

a) treating a biological sample containing nucleic acids with the test compound,

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions
whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide comprising a
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof,

c) quantifying the amount of hybridization complex, and

d) comparing the amount of hybridization complex in the treated biological sample with
the amount of hybridization complex in an untreated biological sample, wherein a
difference in the amount of hybridization complex in the treated biological sample
indicates potential toxicity of the test compound.

30. A method for a diagnostic test for a condition or disease associated with the expression
of KPP in a biological sample, the method comprising:

a) combining the biological sample with an antibody of claim 11, under conditions
suitable for the antibody to bind the polypeptide and form an antibody:polypeptide
complex, and

b) detecting the complex, wherein the presence of the complex correlates with the
presence of the polypeptide in the biological sample.

31. The antibody of claim 11, wherein the antibody is:

a) a chimeric antibody,

b) a single chain antibody,

c) a Fab fragment,

d) a F(ab')2 fragment, or

e) a humanized antibody.

32. A composition comprising an antibody of claim 11 and an acceptable excipient.

33. A method of diagnosing a condition or disease associated with the expression of KPP in
a subject, comprising administering to said subject an effective amount of the composition of claim
32.

34. A composition of claim 32, further comprising a label.

35. A method of diagnosing a condition or disease associated with the expression of KPP in
a subject, comprising administering to said subject an effective amount of the composition of claim
34.

36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim
11, the method comprising:

a) immunizing an animal with a polypeptide consisting of an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15, or an immunogenic fragment
thereof, under conditions to elicit an antibody response,

b) isolating antibodies from the animal, and

c) screening the isolated antibodies with the polypeptide, thereby identifying a
polyclonal antibody which specifically binds to a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-15.

37. A polyclonal antibody produced by a method of claim 36.

38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier.



39. A method of making a monoclonal antibody with the specificity of the antibody of claim
11, the method comprising:

a) immunizing an animal with a polypeptide consisting of an amino acid sequence
selected from the group consisting of SEQ ID NO:1-15, or an immunogenic fragment
thereof, under conditions to elicit an antibody response,

b) isolating antibody producing cells from the animal,

c) fusing the antibody producing cells with immortalized cells to form monoclonal
antibody-producing hybridoma cells,

d) culturing the hybridoma cells, and

e) isolating from the culture monoclonal antibody which specifically binds to a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-15.

40. A monoclonal antibody produced by a method of claim 39.

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier.

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab
expression library.

43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant
immunoglobulin library.

44. A method of detecting a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-15 in a sample, the method comprising:

a) incubating the antibody of claim 11 with the sample under conditions to allow
specific binding of the antibody and the polypeptide, and

b) detecting specific binding, wherein specific binding indicates the presence of a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-15 in the sample.

45. A method of purifying a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-15 from a sample, the method comprising:

a) incubating the antibody of claim 11 with the sample under conditions to allow
specific binding of the antibody and the polypeptide, and

b) separating the antibody from the sample and obtaining the purified polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-15.

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim
13.

47. A method of generating an expression profile of a sample which contains
polynucleotides, the method comprising:

a) labeling the polynucleotides of the sample,

b) contacting the elements of the microarray of claim 46 with the labeled
polynucleotides of the sample under conditions suitable for the formation of a
hybridization complex, and

c) quantifying the expression of the polynucleotides in the sample.

48. An array comprising different nucleotide molecules affixed in distinct physical locations
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first
oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous
nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of
claim 12.

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.

51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to said target polynucleotide.

52. An array of claim 48, which is a microarray.

53. An array of claim 48, further comprising said target polynucleotide hybridized to a
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.

54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to
said solid substrate.

55. An array of claim 48, wherein each distinct physical location on the substrate contains
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical
location have the same sequence, and each distinct physical location on the substrate contains
nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at
another distinct physical location on the substrate.

56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3.

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4.

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5.

61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6.

62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7.

63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8.

64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9.

65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10.

66. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:11.

67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12.

68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:13.

69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:14.

70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15.

71. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:16.

72. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:17.

73. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:18.

74. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:19.

75. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:20.

76. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:21.

77. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:22.

78. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:23.

79. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:24.

80. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:25.

81. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:26.

82. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:27.

83. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:28.

84. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:29.

85. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:30.

ABSTRACT OF THE DISCLOSURE

Various embodiments of the invention provide human kinases and phosphatases (KPP) and
polynucleotides which identify and encode KPP. Embodiments of the invention also provide
expression vectors, host cells, antibodies, agonists, and antagonists. Other embodiments provide
methods for diagnosing, treating, or preventing disorders associated with aberrant expression of KPP.

<110> CHAWLA, Narinder K.
WILSON, Amy D.
JIN, Pel 

<120> KINASES AND PHOSPHATASES 

<130> PF-1724 P 

<140> To Be Assigned 
<141> Herewith 

<160> 30 

<170> PERL Program 

<210> 1 

<211> 157 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526185CD1 



<400> 1

Met Ala His Ser Pro Val Gln Ser Gly Leu Pro Gly Met Gln Asn
1 5 10 15

Leu Lys Ala Asp Pro Glu Glu Leu Phe Thr Lys Leu Glu Lys Ile
20 25 30

Gly Lys Gly Ser Phe Gly Glu Val Phe Lys Gly Ile Asp Asn Arg
35 40 45

Thr Gln Lys Val Val Ala Ile Lys Ile Ile Asp Leu Glu Glu Ala
50 55 60

Glu Asp Glu Ile Glu Asp Ile Gln Gln Glu Ile Thr Val Leu Ser
65 70 75

Gln Cys Asp Ser Pro Tyr Val Thr Lys Tyr Tyr Gly Ser Tyr Leu
80 85 90

Lys Asp Thr Lys Leu Trp Ile Ile Met Glu Tyr Leu Gly Gly Gly
95 100 105

Ser Ala Leu Asp Leu Leu Glu Pro Gly Pro Leu Asp Glu Thr Gln
110 115 120

Ile Ala Thr Ile Leu Arg Glu Ile Leu Lys Gly Leu Asp Tyr Leu
125 130 135

His Ser Glu Lys Lys Ile His Arg Asp Ile Lys Gly Arg His Leu
140 145 150

Val Pro Gly His Asn Ser Tyr
155
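
The listing gives each sequence in three-letter residue codes, with the total length stated in the `<211>` field. As a minimal sketch (the helper name and lookup table are illustrative assumptions, not part of the listing), the three-letter form can be converted to one-letter code and its length checked against `<211>`. The conversion also fails loudly on tokens that are not standard residue codes, which helps catch transcription errors.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    letters = []
    for token in three_letter_seq.split():
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {token!r}")
        letters.append(THREE_TO_ONE[token])
    return "".join(letters)


# First fifteen residues of SEQ ID NO:1 as given in the listing.
first_row = "Met Ala His Ser Pro Val Gln Ser Gly Leu Pro Gly Met Gln Asn"
print(to_one_letter(first_row))
```

Applying `len()` to the converted full sequence and comparing it with the `<211>` value (157 for SEQ ID NO:1) gives a quick consistency check on a transcribed entry.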









<210> 2
<211> 305
<212> PRT

<213> Homo sapiens
<220>

<221> misc_feature

<223> Incyte ID No: 7526192CD1

<400> 2

Met Asp Phe Asp Lys Lys Gly Gly Lys Gly Glu Thr Glu Glu Gly
1 5 10 15

Arg Arg Met Ser Lys Ala Gly Gly Gly Arg Ser Ser His Gly Ile
20 25 30

Arg Ser Ser Gly Thr Ser Ser Gly Val Leu Met Val Gly Pro Asn
35 40 45

Phe Arg Val Gly Lys Lys Ile Gly Cys Gly Asn Phe Gly Glu Leu
50 55 60

Arg Leu Gly Lys Asn Leu Tyr Thr Asn Glu Tyr Val Ala Ile Lys
65 70 75

Leu Val Ser Arg Pro Leu His Pro Thr Pro Ala Asp Val Pro Pro
80 85 90

Arg Asp Phe Arg Ala Ala Thr Arg Ser Pro Gly Asp Ser Leu Leu
95 100 105

Cys Pro Gln Glu Pro Ile Lys Ser Arg Ala Pro Gln Leu His Leu
110 115 120

Glu Tyr Arg Phe Tyr Lys Gln Leu Ser Ala Thr Glu Gly Val Pro
125 130 135

Gln Val Tyr Tyr Phe Gly Pro Cys Gly Lys Tyr Asn Ala Met Val
140 145 150

Leu Glu Leu Leu Gly Pro Ile Leu Glu Asp Leu Phe Asp Leu Cys
155 160 165

Asp Arg Thr Phe Thr Leu Thr Thr Val Leu Met Ile Ala Ile Gln
170 175 180

Leu Ile Thr Arg Met Glu Tyr Val His Thr Lys Ser Leu Ile Tyr
185 190 195

Arg Asp Val Lys Pro Glu Asn Phe Leu Val Gly Arg Pro Gly Thr
200 205 210

Lys Arg Gln His Ala Ile His Ile Ile Asp Phe Gly Leu Ala Lys
215 220 225

Glu Tyr Ile Asp Pro Glu Thr Lys Lys His Ile Pro Tyr Arg Glu
230 235 240

His Lys Ser Leu Thr Gly Thr Ala Arg Tyr Met Ser Ile Asn Thr
245 250 255

His Leu Gly Lys Glu Gln Ser Arg Arg Asp Asp Leu Glu Ala Leu
260 265 270

Gly His Met Phe Met Tyr Phe Leu Arg Gly Ser Leu Pro Trp Gln
275 280 285

Gly Leu Lys Val Gly Glu Glu Ala Gly Gln Ala Gly Gly Asp Ala
290 295 300

Gly Arg Glu Gln Gly
305

<210> 3 

<211> 930 

<212> PRT 

<213> Homo sapiens 

<220>

<221> misc_feature
<223> Incyte ID No: 7526193CD1

<400> 3

Met Lys Lys Phe Phe Asp Ser Arg Arg Glu Gln Gly Gly Ser Gly
1 5 10 15

Leu Gly Ser Gly Ser Ser Gly Gly Gly Gly Ser Thr Ser Gly Leu
20 25 30

Gly Ser Gly Tyr Ile Gly Arg Val Phe Gly Ile Gly Arg Gln Gln
35 40 45

Val Thr Val Asp Glu Val Leu Ala Glu Gly Gly Phe Ala Ile Val
50 55 60

Phe Leu Val Arg Thr Ser Asn Gly Met Lys Cys Ala Leu Lys Arg
65 70 75

Met Phe Val Asn Asn Glu His Asp Leu Gln Val Cys Lys Arg Glu
80 85 90

Ile Gln Ile Met Arg Asp Leu Ser Gly His Lys Asn Ile Val Gly
95 100 105

Tyr Ile Asp Ser Ser Ile Asn Asn Val Ser Ser Gly Asp Val Trp
110 115 120

Glu Val Leu Ile Leu Met Asp Phe Cys Arg Gly Gly Gln Val Val
125 130 135

Asn Leu Met Asn Gln Arg Leu Gln Thr Gly Phe Thr Glu Asn Glu
140 145 150

Val Leu Gln Ile Phe Cys Asp Thr Cys Glu Ala Val Ala Arg Leu
155 160 165

His Gln Cys Lys Thr Pro Ile Ile His Arg Asp Leu Lys Val Glu
170 175 180

Asn Ile Leu Leu His Asp Arg Gly His Tyr Val Leu Cys Asp Phe
185 190 195

Gly Ser Ala Thr Asn Lys Phe Gln Asn Pro Gln Thr Glu Gly Val
200 205 210

Asn Ala Val Glu Asp Glu Ile Lys Lys Tyr Thr Thr Leu Ser Tyr
215 220 225

Arg Ala Pro Glu Met Val Asn Leu Tyr Ser Gly Lys Ile Ile Thr
230 235 240

Thr Lys Ala Asp Ile Trp Ala Leu Gly Cys Leu Leu Tyr Lys Leu
245 250 255

Cys Tyr Phe Thr Leu Pro Phe Gly Glu Ser Gln Val Ala Ile Cys
260 265 270

Asp Gly Asn Phe Thr Ile Pro Asp Asn Ser Arg Tyr Ser Gln Asp
275 280 285

Met His Cys Leu Ile Arg Tyr Met Leu Glu Pro Asp Pro Asp Lys
290 295 300

Arg Pro Asp Ile Tyr Gln Val Ser Tyr Phe Ser Phe Lys Leu Leu
305 310 315

Lys Lys Glu Cys Pro Ile Pro Asn Val Gln Asn Ser Pro Ile Pro
320 325 330

Ala Lys Leu Pro Glu Pro Val Lys Ala Ser Glu Ala Ala Ala Lys
335 340 345

Lys Thr Gln Pro Lys Ala Arg Leu Thr Asp Pro Ile Pro Thr Thr
350 355 360

Glu Thr Ser Ile Ala Pro Arg Gln Arg Pro Lys Ala Gly Gln Thr
365 370 375

Gln Pro Asn Pro Gly Ile Leu Pro Ile Gln Pro Ala Leu Thr Pro
380 385 390

Arg Lys Arg Ala Thr Val Gln Pro Pro Pro Gln Ala Ala Gly Ser
395 400 405

Ser Asn Gln Pro Gly Leu Leu Ala Ser Val Pro Gln Pro Lys Pro
410 415 420

Gln Ala Pro Pro Ser Gln Pro Leu Pro Gln Thr Gln Ala Lys Gln
425 430 435

Pro Gln Ala Pro Pro Thr Pro Gln Gln Thr Pro Ser Thr Gln Ala
440 445 450

Gln Gly Leu Pro Ala Gln Ala Gln Ala Thr Pro Gln His Gln Gln
455 460 465

Gln Leu Phe Leu Lys Gln Gln Gln Gln Gln Gln Gln Pro Pro Pro
470 475 480

Ala Gln Gln Gln Pro Ala Gly Thr Phe Tyr Gln Gln Gln Gln Ala
485 490 495

Gln Thr Gln Gln Phe Gln Ala Val His Pro Ala Thr Gln Gln Pro
500 505 510

Ala Ile Ala Gln Phe Pro Val Val Ser Gln Gly Gly Ser Gln Gln
515 520 525

Gln Leu Met Gln Asn Phe Tyr Gln Gln Gln Gln Gln Gln Gln Gln
530 535 540

Gln Gln Gln Gln Gln Gln Leu Ala Thr Ala Leu His Gln Gln Gln
545 550 555

Leu Met Thr Gln Gln Ala Ala Leu Gln Gln Lys Pro Thr Met Ala
560 565 570

Ala Gly Gln Gln Pro Gln Pro Gln Pro Ala Ala Ala Pro Gln Pro
575 580 585

Ala Pro Ala Gln Glu Pro Ala Gln Ile Gln Ala Pro Val Arg Gln
590 595 600

Gln Pro Lys Val Gln Thr Thr Pro Pro Pro Ala Val Gln Gly Gln
605 610 615

Lys Val Gly Ser Leu Thr Pro Pro Ser Ser Pro Lys Thr Gln Arg
620 625 630

Ala Gly His Arg Arg Ile Leu Ser Asp Val Thr His Ser Ala Val
635 640 645

Phe Gly Val Pro Ala Ser Lys Ser Thr Gln Leu Leu Gln Ala Ala
650 655 660

Ala Ala Glu Ala Ser Leu Asn Lys Ser Lys Ser Ala Thr Thr Thr
665 670 675

Pro Ser Gly Ser Pro Arg Thr Ser Gln Gln Asn Val Tyr Asn Pro
680 685 690

Ser Glu Gly Ser Thr Trp Asn Pro Phe Asp Asp Asp Asn Phe Ser
695 700 705

Lys Leu Thr Ala Glu Glu Leu Leu Asn Lys Asp Phe Ala Lys Leu
710 715 720

Gly Glu Gly Lys His Pro Glu Lys Leu Gly Gly Ser Ala Glu Ser
725 730 735

Leu Ile Pro Gly Phe Gln Ser Thr Gln Gly Asp Ala Phe Ala Thr
740 745 750

Thr Ser Phe Ser Ala Gly Thr Glu Lys Leu Ile Glu Gly Leu Lys
755 760 765

Ser Pro Asp Thr Ser Leu Leu Leu Pro Asp Leu Leu Pro Met Thr
770 775 780

Asp Pro Phe Gly Ser Thr Ser Asp Ala Val Ile Glu Lys Ala Asp
785 790 795

Val Ala Val Glu Ser Leu Ile Pro Gly Leu Glu Pro Pro Val Pro
800 805 810

Gln Arg Leu Pro Ser Gln Thr Glu Ser Val Thr Ser Asn Arg Thr
815 820 825

Asp Ser Leu Thr Gly Glu Asp Ser Leu Leu Asp Cys Ser Leu Leu
830 835 840

Ser Asn Pro Thr Thr Asp Leu Leu Glu Glu Phe Ala Pro Thr Ala
845 850 855

Ile Ser Ala Pro Val His Lys Ala Ala Glu Asp Ser Asn Leu Ile
860 865 870

Ser Gly Phe Asp Val Pro Glu Gly Ser Asp Lys Val Ala Glu Asp
875 880 885

Glu Phe Asp Pro Ile Pro Val Leu Ile Thr Lys Asn Pro Gln Gly
890 895 900

Gly His Ser Arg Asn Ser Ser Gly Ser Ser Glu Ser Ser Leu Pro
905 910 915

Asn Leu Ala Arg Ser Leu Leu Leu Val Asp Gln Leu Ile Asp Leu
920 925 930

<210> 4

<211> 118 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526196CD1

<400> 4

Met Ser Leu Leu Gln Ser Ala Leu Asp Phe Leu Ala Gly Pro Gly
1 5 10 15

Ser Leu Gly Gly Ala Ser Gly Arg Asp Gln Ser Asp Phe Val Gly
20 25 30

Gln Thr Val Glu Leu Gly Glu Leu Arg Leu Arg Val Arg Arg Val
35 40 45

Leu Ala Glu Gly Gly Phe Ala Phe Val Tyr Glu Ala Gln Asp Val
50 55 60

Gly Ser Gly Arg Glu Tyr Ala Leu Lys Arg Leu Leu Ser Asn Glu
65 70 75

Glu Glu Lys Asn Arg Ala Ile Ile Gln Glu Val Cys Phe Met Leu
80 85 90

Cys Ser Leu Gly Glu Pro Ala Gly Cys Leu Ser Val Gly Ser Gly
95 100 105

Gly His Ser His Ala Ser Ala Ser Leu Arg Thr Ala Pro
110 115

<210> 5 

<211> 1355 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526198CD1

<400> 5

Met Ser Leu Leu Gln Ser Ala Leu Asp Phe Leu Ala Gly Pro Gly
1 5 10 15

Ser Leu Gly Gly Ala Ser Gly Arg Asp Gln Ser Asp Phe Val Gly
20 25 30

Gln Thr Val Glu Leu Gly Glu Leu Arg Leu Arg Val Arg Arg Val
35 40 45

Leu Ala Glu Gly Gly Phe Ala Phe Val Tyr Glu Ala Gln Asp Val
50 55 60

Gly Ser Gly Arg Glu Tyr Ala Leu Lys Arg Leu Leu Ser Asn Glu
65 70 75

Glu Glu Lys Asn Arg Ala Ile Ile Gln Glu Val Cys Phe Met Lys
80 85 90

Lys Leu Ser Gly His Pro Asn Ile Val Gln Phe Cys Ser Ala Ala
95 100 105

Ser Ile Gly Lys Glu Glu Ser Asp Thr Gly Gln Ala Glu Phe Leu
110 115 120

Leu Leu Thr Glu Leu Cys Lys Gly Gln Leu Val Glu Phe Leu Lys
125 130 135

Lys Met Glu Ser Arg Gly Pro Leu Ser Cys Asp Thr Val Leu Lys
140 145 150

Ile Phe Tyr Gln Thr Cys Arg Ala Val Gln His Met His Arg Gln
155 160 165

Lys Pro Pro Ile Ile His Arg Asp Leu Lys Val Glu Asn Leu Leu
170 175 180

Leu Ser Asn Gln Gly Thr Ile Lys Leu Cys Asp Phe Gly Ser Ala
185 190 195

Thr Thr Ile Ser His Tyr Pro Asp Tyr Ser Trp Ser Ala Gln Arg
200 205 210

Arg Ala Leu Val Glu Glu Glu Ile Thr Arg Asn Thr Thr Pro Met
215 220 225

Tyr Arg Thr Pro Glu Ile Ile Asp Leu Tyr Ser Asn Phe Pro Ile
230 235 240

Gly Glu Lys Gln Asp Ile Trp Ala Leu Gly Cys Ile Leu Tyr Leu
245 250 255

Leu Cys Phe Arg Gln His Pro Phe Glu Asp Gly Ala Lys Leu Arg
260 265 270

Ile Val Asn Gly Lys Tyr Ser Ile Pro Pro His Asp Thr Gln Tyr
275 280 285

Thr Val Phe His Ser Leu Ile Arg Ala Met Leu Gln Val Asn Pro
290 295 300

Glu Glu Arg Leu Ser Ile Ala Glu Val Val His Gln Leu Gln Glu
305 310 315

Ile Ala Ala Ala Arg Asn Val Asn Pro Lys Ser Pro Ile Thr Glu
320 325 330

Leu Leu Glu Gln Asn Gly Gly Tyr Gly Ser Ala Thr Leu Ser Arg
335 340 345

Gly Pro Pro Pro Pro Val Gly Pro Ala Gly Ser Gly Tyr Ser Gly
350 355 360

Gly Leu Ala Leu Ala Glu Tyr Asp Gln Pro Tyr Gly Gly Phe Leu
365 370 375

Asp Ile Leu Arg Gly Gly Thr Glu Arg Leu Phe Thr Asn Leu Lys
380 385 390

Asp Thr Ser Ser Lys Val Ile Gln Ser Val Ala Asn Tyr Ala Lys
395 400 405

Gly Asp Leu Asp Ile Ser Tyr Ile Thr Ser Arg Ile Ala Val Met
410 415 420

Ser Phe Pro Ala Glu Gly Val Glu Ser Ala Leu Lys Asn Asn Ile
425 430 435

Glu Asp Val Arg Leu Phe Leu Asp Ser Lys His Pro Gly His Tyr
440 445 450

Ala Val Tyr Asn Leu Ser Pro Arg Thr Tyr Arg Pro Ser Arg Phe
455 460 465

His Asn Arg Val Ser Glu Cys Gly Trp Ala Ala Arg Arg Ala Pro
470 475 480

His Leu His Thr Leu Tyr Asn Ile Cys Arg Asn Met His Ala Trp
485 490 495

Leu Arg Gln Asp His Lys Asn Val Cys Val Val His Cys Met Asp
500 505 510

Gly Arg Ala Ala Ser Ala Val Ala Val Cys Ser Phe Leu Cys Phe
515 520 525

Cys Arg Leu Phe Ser Thr Ala Glu Ala Ala Val Tyr Met Phe Ser
530 535 540

Met Lys Arg Cys Pro Pro Gly Ile Trp Pro Ser His Lys Arg Tyr
545 550 555

Ile Glu Tyr Met Cys Asp Met Val Ala Glu Glu Pro Ile Thr Pro
560 565 570

His Ser Lys Pro Ile Leu Val Arg Ala Val Val Met Thr Pro Val
575 580 585

Pro Leu Phe Ser Lys Gln Arg Ser Gly Cys Arg Pro Phe Cys Glu
590 595 600

Val Tyr Val Gly Asp Glu Arg Val Ala Ser Thr Ser Gln Glu Tyr
605 610 615

Asp Lys Met Arg Asp Phe Lys Ile Glu Asp Gly Ile Ala Val Ile
620 625 630

Pro Leu Gly Val Thr Val Gln Gly Asp Val Leu Ile Val Ile Tyr
635 640 645

His Ala Arg Ser Thr Leu Gly Gly Arg Leu Gln Ala Lys Met Ala
650 655 660

Ser Met Lys Met Phe Gln Ile Gln Phe His Thr Gly Phe Val Pro
665 670 675

Arg Asn Ala Thr Thr Val Lys Phe Ala Lys Tyr Asp Leu Asp Ala
680 685 690

Cys Asp Ile Gln Glu Lys Tyr Pro Asp Leu Phe Gln Val Asn Leu
695 700 705

Glu Val Glu Val Glu Pro Arg Asp Arg Pro Ser Arg Glu Ala Pro
710 715 720

Pro Trp Glu Asn Ser Ser Met Arg Gly Leu Asn Pro Lys Ile Leu
725 730 735

Phe Ser Ser Arg Glu Glu Gln Gln Asp Ile Leu Ser Lys Phe Gly
740 745 750

Lys Pro Glu Leu Pro Arg Gln Pro Gly Ser Thr Ala Gln Tyr Asp
755 760 765

Ala Gly Ala Gly Ser Pro Glu Ala Glu Pro Thr Asp Ser Asp Ser
770 775 780

Pro Pro Ser Ser Ser Ala Asp Ala Ser Arg Phe Leu His Thr Leu
785 790 795

Asp Trp Gln Glu Glu Lys Glu Ala Glu Thr Gly Ala Glu Asn Ala
800 805 810

Ser Ser Lys Glu Ser Glu Ser Ala Leu Met Glu Asp Arg Asp Glu
815 820 825

Ser Glu Val Ser Asp Glu Gly Gly Ser Pro Ile Ser Ser Glu Gly
830 835 840

Gln Glu Pro Arg Ala Asp Pro Glu Pro Pro Gly Leu Ala Ala Gly
845 850 855

Leu Val Gln Gln Asp Leu Val Phe Glu Val Glu Thr Pro Ala Val
860 865 870

Leu Pro Glu Pro Val Pro Gln Glu Asp Gly Val Asp Leu Leu Gly
875 880 885

Leu His Ser Glu Val Gly Ala Gly Pro Ala Val Pro Pro Gln Ala
890 895 900

Cys Lys Ala Pro Ser Ser Asn Thr Asp Leu Leu Ser Cys Leu Leu
905 910 915

Gly Pro Pro Glu Ala Ala Ser Gln Gly Pro Pro Glu Asp Leu Leu
920 925 930

Ser Glu Asp Pro Leu Leu Leu Ala Ser Pro Ala Pro Pro Leu Ser
935 940 945

Val Gln Ser Thr Pro Arg Gly Gly Pro Pro Ala Ala Ala Asp Pro
950 955 960

Phe Gly Pro Leu Leu Pro Ser Ser Gly Asn Asn Ser Gln Pro Cys
965 970 975

Ser Asn Pro Asp Leu Phe Gly Glu Phe Leu Asn Ser Asp Ser Val
980 985 990

Thr Val Pro Pro Ser Phe Pro Ser Ala His Ser Ala Pro Pro Pro
995 1000 1005

Ser Cys Ser Ala Asp Phe Leu His Leu Gly Asp Leu Pro Gly Glu
1010 1015 1020

Pro Ser Lys Met Thr Ala Ser Ser Ser Asn Pro Asp Leu Leu Gly
1025 1030 1035

Gly Trp Ala Ala Trp Thr Glu Thr Ala Ala Ser Ala Val Ala Pro 
1040 1045 1050 
Thr Pro Ala Thr Glu Gly Pro Leu Phe Ser Pro Gly Gly Pro
1055 1060

Ala Pro Cys Gly Ser Gln Ala Ser Trp Thr Lys Ser Gln Asn Pro
1070 1075 1080

Asp Pro Phe Ala Asp Leu Gly Asp Leu Ser Ser Gly Leu Gly Asp
1085 1090 1095

Pro Gln Ala Gln Ser Thr Val Ser Pro Arg Gly Gln Arg Val Cys
1100 1105 1110

Thr Cys Ser Arg Arg Leu Pro Thr Gly Lys Leu Lys Pro Val
1115 1120

Ala Asp Thr Gly Thr Ala Ala Ser Pro His Arg His Cys Ser
1130 1135

Pro Ala Gly Phe Pro Pro Gly Gly Phe Ile Pro Lys Thr Ala Thr
1145 1150 1155

Thr Pro Lys Gly Ser Ser Ser Trp Gln Thr Ser Arg Pro Pro Ala
1160 1165 1170

Gln Gly Ala Ser Trp Pro Pro Gln Ala Lys Pro Pro Pro Ala
1175 1180 1185

Cys Thr Gln Pro Arg Pro Asn Tyr Ala Ser Asn Phe Ser Val Ile
1190 1195 1200

Gly Ala Arg Glu Glu Arg Gly Val Arg Ala Pro Ser Phe Ala Gln
1205 1210 1215

Lys Pro Lys Val Ser Glu Asn Asp Phe Glu Asp Leu Leu Ser Asn
1220 1225 1230

Gln Gly Phe Ser Ser Arg Ser Asp Lys Lys Gly Pro Lys Thr Ile
1235 1240 1245

Ala Glu Met Arg Lys Gln Asp Leu Ala Lys Asp Thr Asp Pro Leu
1250 1255 1260

Lys Leu Lys Leu Leu Asp Trp Ile Glu Gly Lys Glu Arg Asn Ile
1265 1270 1275

Arg Ala Leu Leu Ser Thr Leu His Thr Val Leu Trp Asp Gly
1280 1285

Ser Arg Trp Thr Pro Val Gly Met Ala Asp Leu Val Ala Pro Qj^
1295 1300 1305

Gln Val Lys Lys His Tyr Arg Arg Ala Val Leu Ala Val His Pro
1310 1315 1320

Asp Lys Ala Ala Gly Gln Pro Tyr Glu Gln His Ala Lys Met Ile
1325 1330 1335

Phe Met Glu Leu Asn Asp Ala Trp Ser Glu Phe Glu Asn
1340 1345 1350

Ser Arg Pro Leu Phe
1355

<210> 6 

<211> 490 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526208CD1 

<400> 6

Met Ala Ser Thr Thr Thr Cys Thr Arg Phe Thr Asp Glu Tyr Gln
1 5 10 15

Leu Phe Glu Glu Leu Gly Lys Gly Ala Phe Ser Val Val Arg Arg
20 25 30

Cys Met Lys Ile Pro Thr Gly Gln Glu Tyr Ala Ala Lys Ile Ile
35 40 45

Asn Thr Lys Lys Leu Ser Ala Arg Val Arg Leu His Asp Ser Ile
50 55 60

Ser Glu Glu Gly Phe His Tyr Leu Val Phe Asp Leu Val Thr Gly
65 70 75

Gly Glu Leu Phe Glu Asp Ile Val Ala Arg Glu Tyr Tyr Ser Glu
80 85 90

Ala Asp Ala Ser His Cys Ile Gln Gln Ile Leu Glu Ala Val Leu
95 100 105

His Cys His Gln Met Gly Val Val His Arg Asp Leu Lys Pro Glu
110 115 120

Asn Leu Leu Leu Ala Ser Lys Ser Lys Gly Ala Ala Val Lys Leu
125 130 135

Ala Asp Phe Gly Leu Ala Ile Glu Val Gln Gly Asp Gln Gln Ala
140 145 150

Trp Phe Gly Phe Ala Gly Thr Pro Gly Tyr Leu Ser Pro Glu Val
155 160 165

Leu Arg Lys Asp Pro Tyr Gly Lys Pro Val Asp Met Trp Ala Cys
170 175 180

Gly Val Ile Leu Tyr Ile Leu Leu Val Gly Tyr Pro Pro Phe Trp
185 190 195

Asp Glu Asp Gln His Arg Leu Tyr Gln Gln Ile Lys Ala Gly Ala
200 205 210

Tyr Asp Phe Pro Ser Pro Glu Trp Asp Thr Val Thr Pro Glu Ala
215 220 225

Lys Asp Leu Ile Asn Lys Met Leu Thr Ile Asn Pro Ala Lys Arg
230 235 240

Ile Thr Ala Ser Glu Ala Leu Lys His Pro Trp Ile Cys Gln Arg
245 250 255

Ser Thr Val Ala Ser Met Met His Arg Gln Glu Thr Val Asp Cys
260 265 270

Leu Lys Lys Phe Asn Ala Arg Arg Lys Leu Lys Gly Ala Ile Leu
275 280 285

Thr Thr Met Leu Ala Thr Arg Asn Phe Ser Ala Ala Lys Ser Leu
290 295 300

Leu Lys Lys Pro Asp Gly Val Lys Lys Arg Lys Ser Ser Ser Ser
305 310 315

Val Gln Met Met Glu Ser Thr Glu Ser Ser Asn Thr Thr Ile Glu
320 325 330

Asp Glu Asp Val Glu Ala Arg Lys Gln Glu Ile Ile Lys Val Thr
335 340 345

Glu Gln Leu Ile Glu Ala Ile Asn Asn Gly Asp Phe Glu Ala Tyr
350 355 360

Thr Lys Ile Cys Asp Pro Gly Leu Thr Ala Phe Glu Pro Glu Ala
365 370 375

Leu Gly Asn Leu Val Glu Gly Met Asp Phe His Arg Phe Tyr Phe
380 385 390

Glu Asn Ala Leu Ser Lys Ser Asn Lys Pro Ile His Thr Ile Ile
395 400 405

Leu Asn Pro His Val His Leu Val Gly Asp Asp Ala Ala Cys Ile
410 415 420

Ala Tyr Ile Arg Leu Thr Gln Tyr Met Asp Gly Ser Gly Met Pro
425 430 435

Lys Thr Met Gln Ser Glu Glu Thr Arg Val Trp His Arg Arg Asp
440 445 450

Gly Lys Trp Gln Asn Val His Phe His Arg Ser Gly Ser Pro Thr
455 460 465

Val Pro Ile Lys Pro Pro Cys Ile Pro Asn Gly Lys Glu Asn Phe
470 475 480

Ser Gly Gly Thr Ser Leu Trp Gln Asn Ile
485 490

<210> 7 

<211> 344 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526212CD1 

<400> 7 

Met Ala Ser Thr Thr Thr Cys Thr Arg Phe Thr Asp Glu Tyr Gln
1 5 10 15

Leu Phe Glu Glu Leu Gly Lys Gly Ala Phe Ser Val Val Arg Arg
20 25 30

Cys Met Lys Ile Pro Thr Gly Gln Glu Tyr Ala Ala Lys Ile Ile
35 40 45

Asn Thr Lys Lys Leu Ser Ala Arg Val Arg Leu His Asp Ser Ile
50 55 60

Ser Glu Glu Gly Phe His Tyr Leu Val Val Asp Leu Val Thr Gly
65 70 75

Gly Glu Leu Phe Glu Asp Ile Val Ala Arg Glu Tyr Tyr Ser Glu
80 85 90

Ala Asp Ala Ser His Cys Ile Gln Gln Ile Leu Glu Ala Val Leu
95 100 105

His Cys His Gln Met Gly Val Val His Arg Asp Leu Lys Pro Glu
110 115 120

Asn Leu Leu Leu Ala Ser Lys Ser Lys Gly Ala Ala Val Lys Leu
125 130 135

Ala Asp Phe Gly Leu Ala Ile Glu Val Gln Gly Asp Gln Gln Ala
140 145 150

Trp Phe Gly Phe Ala Gly Thr Pro Gly Tyr Leu Ser Pro Glu Val
155 160 165

Leu Arg Lys Asp Pro Tyr Gly Lys Pro Val Asp Met Trp Ala Cys
170 175 180

Gly Val Ile Leu Tyr Ile Leu Leu Val Gly Tyr Pro Pro Phe Trp
185 190 195

Asp Glu Asp Gln His Arg Leu Tyr Gln Gln Ile Lys Ala Gly Ala
200 205 210

Tyr Asp Phe Pro Ser Pro Glu Trp Asp Thr Val Thr Pro Glu Ala
215 220 225

Lys Asp Leu Ile Asn Lys Met Leu Thr Ile Asn Pro Ala Lys Arg
230 235 240

Ile Thr Ala Ser Glu Ala Leu Lys His Pro Trp Ile Cys Gln Arg
245 250 255

Ser Thr Val Ala Ser Met Met His Arg Gln Glu Thr Val Asp Cys
260 265 270

Leu Lys Lys Phe Asn Ala Arg Arg Lys Leu Lys Gly Ala Ile Leu
275 280 285

Thr Thr Met Leu Ala Thr Arg Asn Phe Ser Ala Ala Lys Ser Leu
290 295 300

Leu Lys Lys Pro Asp Gly Val Lys Glu Ser Thr Glu Ser Ser Asn
305 310 315

Thr Thr Ile Glu Asp Glu Asp Val Lys Gly Thr Val Ala His Ala
320 325 330

Cys Asn Pro Ser Thr Leu Gly Gly Arg Gly Gly Gln Ile Thr
335 340
<210> 8
<211> 89 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7526213CD1 

<400> 8 

Met Lys Lys Phe Ser Arg Met Pro Lys Ser Glu Gly Gly Ser Gly
1 5 10 15

Gly Gly Ala Ala Gly Gly Gly Ala Gly Gly Ala Gly Ala Gly Ala
20 25 30

Gly Cys Gly Ser Gly Gly Ser Ser Val Gly Val Arg Val Phe Ala
35 40 45

Val Gly Arg His Gln Val Thr Leu Glu Glu Ser Leu Ala Glu Val
50 55 60

Ile Gln Met Leu Pro Val Gln Glu Pro Arg Leu Glu Tyr Arg Val
65 70 75

Pro Leu Ile Ser Ser Gly Arg Arg Arg Leu Arg Arg Arg Cys
80 85



<210> 9 
<211> 88 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7526214CD1 



<400> 9 






Met Lys Lys Phe Ser Arg Met Pro Lys Ser Glu Gly Gly Ser Gly
1 5 10 15

Gly Gly Ala Ala Gly Gly Gly Ala Gly Gly Ala Gly Ala Gly Ala
20 25 30

Gly Cys Gly Ser Gly Gly Ser Ser Val Gly Val Arg Val Phe Ala
35 40 45

Val Gly Arg His Gln Val Thr Leu Glu Glu Ser Leu Ala Glu Gly
50 55 60

Thr Gly Ala Arg Gly Gly Ser Asp Arg Gln Val Asp Ser Pro Gln
65 70 75

Phe Ser Ser Cys Val Leu Thr Val Glu Ser Asp Val His
80 85



<210> 10 
<211> 137 
<212> PRT 

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526228CD1 



<400> 10 

Met Ser Thr Ala Ser Ala Ala Ser Ser Ser Ser Ser Ser Ser Ala
1 5 10 15

Gly Glu Met Ile Glu Ala Pro Ser Gln Val Leu Asn Phe Glu Glu
20 25 30

Ile Asp Tyr Lys Glu Ile Glu Val Glu Glu Val Val Gly Arg Gly
35 40 45

Ala Phe Gly Val Val Cys Lys Ala Lys Trp Arg Ala Lys Asp Val
50 55 60

Ala Ile Lys Gln Ile Glu Ser Glu Ser Glu Arg Lys Ala Phe Ile
65 70 75

Val Glu Leu Arg Gln Leu Ser Arg Val Asn His Pro Asn Ile Val
80 85 90

Lys Leu Tyr Gly Ala Cys Leu Asn Pro Val Cys Leu Val Met Glu
95 100 105

Tyr Ala Glu Gly Gly Ser Leu Tyr Asn Val Cys Ala Phe Leu Ser
110 115 120

Gln Cys Cys Met Val Leu Asn His Cys His Ile Ile Leu Leu Pro
125 130 135

Thr Gln



<210> 11 
<211> 243 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7526246CD1

<400> 11 

Met Ala Asp Leu Glu Ala Val Leu Ala Asp Val Ser Tyr Leu Met
1 5 10 15

Ala Met Glu Lys Ser Lys Ala Thr Pro Ala Ala Arg Ala Ser Lys
20 25 30

Lys Ile Leu Leu Pro Glu Pro Ser Ile Arg Ser Val Met Gln Lys
35 40 45

Tyr Leu Glu Asp Arg Gly Glu Val Thr Phe Glu Lys Ile Phe Ser
50 55 60

Gln Lys Leu Gly Tyr Leu Leu Phe Arg Asp Phe Cys Leu Asn His
65 70 75

Leu Glu Glu Ala Arg Pro Leu Val Glu Phe Tyr Glu Glu Ile Lys
80 85 90

Lys Tyr Glu Lys Leu Glu Thr Glu Glu Glu Arg Val Ala Arg Ser
95 100 105

Arg Glu Ile Phe Asp Ser Tyr Ile Met Lys Glu Leu Leu Ala Cys
110 115 120

Ser His Pro Phe Ser Lys Ser Ala Thr Glu His Val Gln Gly His
125 130 135

Leu Gly Lys Lys Gln Val Pro Pro Asp Leu Phe Gln Pro Tyr Ile
140 145 150

Glu Glu Ile Cys Gln Asn Leu Arg Gly Asp Val Phe Gln Lys Phe
155 160 165

Ile Glu Ser Asp Lys Phe Thr Arg Phe Cys Gln Trp Lys Asn Val
170 175 180

Glu Leu Asn Ile His Val Ser Gly Leu Gly Trp Gly Met Glu Ser
185 190 195

His Ala Pro Cys Cys Ser Ser Pro Gly Ser Trp Ala Cys Gly Leu
200 205 210

Ala Gly Arg Gly Arg Ser Gly Asp Val Cys Pro Leu Ala Pro Arg
215 220 225

Ala Val Ala Met Gly Val Arg Ala Gly Ile Pro Ala Trp Gly Gly
230 235 240

Arg Ser Arg



<210> 12 

<211> 463 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526258CD1 

<400> 12 

Met Arg Arg Pro Arg Gly Glu Pro Gly Pro Arg Ala Pro Arg Pro
1 5 10 15

Thr Glu Gly Ala Thr Cys Ala Gly Pro Gly Glu Ser Trp Ser Pro
20 25 30

Ser Pro Asn Ser Met Leu Arg Val Leu Leu Ser Ala Gln Thr Ser
35 40 45

Pro Ala Arg Leu Ser Gly Leu Leu Leu Ile Pro Pro Val Gln Pro
50 55 60

Cys Cys Leu Gly Pro Ser Lys Trp Gly Asp Arg Pro Val Gly Gly
65 70 75

Gly Pro Ser Ala Gly Pro Val Gln Gly Leu Gln Arg Leu Leu Glu
80 85 90

Gln Ala Lys Ser Pro Gly Glu Leu Leu Arg Trp Leu Gly Gln Asn
95 100 105

Pro Ser Lys Val Arg Ala His His Tyr Ser Val Ala Leu Arg Arg
110 115 120

Leu Gly Gln Leu Leu Gly Ser Arg Pro Arg Pro Pro Pro Val Glu
125 130 135

Gln Val Thr Leu Gln Asp Leu Ser Gln Leu Ile Ile Arg Asn Cys
140 145 150

Pro Ser Phe Asp Ile His Thr Ile His Val Cys Leu His Leu Ala
155 160 165

Val Leu Leu Gly Phe Pro Ser Asp Gly Pro Leu Val Cys Ala Leu
170 175 180

Glu Gln Glu Arg Arg Leu Arg Leu Pro Pro Lys Pro Pro Pro Pro
185 190 195

Leu Gln Pro Leu Leu Arg Glu Ala Arg Pro Glu Glu Leu Thr Pro
200 205 210

His Val Met Val Leu Leu Ala Gln His Leu Ala Arg His Arg Leu
215 220 225

Arg Glu Pro Gln Leu Leu Glu Ala Ile Thr His Phe Leu Val Val
230 235 240

Gln Glu Thr Gln Leu Ser Ser Lys Val Val Gln Lys Leu Val Leu
245 250 255

Pro Phe Gly Arg Leu Asn Tyr Leu Pro Leu Glu Gln Gln Phe Met
260 265 270

Pro Cys Leu Glu Arg Ile Leu Ala Arg Glu Ala Gly Val Ala Pro
275 280 285

Leu Ala Thr Val Asn Ile Leu Met Ser Leu Cys Gln Leu Arg Cys
290 295 300

Leu Pro Phe Arg Ala Leu His Phe Val Phe Ser Pro Gly Phe Ile
305 310 315

Asn Tyr Ile Ser Gly Thr Pro His Ala Leu Ile Val Arg Arg Tyr
320 325 330

Leu Ser Leu Leu Asp Thr Ala Val Glu Leu Glu Leu Pro Gly Tyr
335 340 345

Arg Gly Pro Arg Leu Pro Arg Arg Gln Gln Val Pro Ile Phe Pro
350 355 360

Gln Pro Leu Ile Thr Asp Arg Ala Arg Cys Lys Tyr Ser His Lys
365 370 375

Asp Ile Val Ala Glu Gly Leu Arg Gln Leu Leu Gly Glu Glu Lys
380 385 390

Tyr Arg Gln Asp Leu Thr Val Pro Pro Gly Tyr Cys Thr Gly Glu
395 400 405

Gln Gly Ala Gly Gly Arg Pro Gly Glu Thr Glu Pro Trp Leu Arg
410 415 420

Pro Pro Ala Leu Leu Pro Ser Arg Leu Pro Ala Val Arg Gln Gln
425 430 435

Leu Trp Cys Cys Ala Ser Arg Glu Asp Pro Gly Pro Leu Pro Ala
440 445 450

Ile Pro Thr Lys Val Leu Pro Thr Gly Pro Gly Cys Leu
455 460





<210> 13 

<211> 184 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526311CD1

<400> 13 



Met Arg Leu Ala Arg Leu Leu Arg Gly Ala Ala Leu Ala Gly Pro
1 5 10 15

Gly Pro Gly Leu Arg Ala Ala Gly Phe Ser Arg Ser Phe Ser Ser
20 25 30

Asp Ser Gly Ser Ser Pro Ala Ser Glu Arg Gly Val Pro Gly Gln
35 40 45

Val Asp Phe Tyr Ala Arg Phe Ser Pro Ser Pro Leu Ser Met Lys
50 55 60

Gln Phe Leu Asp Phe Gly Ser Val Asn Ala Cys Glu Lys Thr Ser
65 70 75

Phe Met Phe Leu Arg Gln Glu Leu Pro Val Arg Leu Ala Asn Ile
80 85 90

Met Lys Glu Ile Ser Leu Leu Pro Asp Asn Leu Leu Arg Thr Pro
95 100 105

Ser Val Gln Leu Val Gln Ser Trp Tyr Ile Gln Ser Leu Gln Glu
110 115 120

Leu Leu Asp Phe Lys Asp Lys Ser Ala Glu Asp Ala Lys Ala Ile
125 130 135

Tyr Glu Arg Pro Arg Arg Thr Trp Leu Gln Val Ser Ser Leu Cys
140 145 150

Cys Met Ala Cys Lys Met Ile Phe Ile Val Trp Trp Lys Arg Gln
155 160 165

Arg Lys Ser Ile Ser Ser Lys Thr His Trp Lys His Lys Ser Lys
170 175 180

Leu Gln Cys Thr









<210> 14 

<211> 386 

<212> PRT 

<213> Homo sapiens 
<220>

<221> misc_feature
<223> Incyte ID No: 7526315CD1 

<400> 14 

Met Ser Ser Leu Gly Ala Ser Phe Val Gln Ile Lys Phe Asp Asp
1 5 10 15

Leu Gln Phe Phe Glu Asn Cys Gly Gly Gly Ser Phe Gly Ser Val
20 25 30

Tyr Arg Ala Lys Trp Ile Ser Gln Asp Lys Glu Val Ala Val Lys
35 40 45

Lys Leu Leu Lys Ile Glu Lys Glu Ala Glu Ile Leu Ser Val Leu
50 55 60

Ser His Arg Asn Ile Ile Gln Phe Tyr Gly Val Ile Leu Glu Pro
65 70 75

Pro Asn Tyr Gly Ile Val Thr Glu Tyr Ala Ser Leu Gly Ser Leu
80 85 90

Tyr Asp Tyr Ile Asn Ser Asn Arg Ser Glu Glu Met Asp Met Asp
95 100 105

His Ile Met Thr Trp Ala Thr Asp Val Ala Lys Gly Met His Tyr
110 115 120

Leu His Met Glu Ala Pro Val Lys Val Ile His Arg Asp Leu Lys
125 130 135

Ser Arg Asn Val Val Ile Ala Ala Asp Gly Val Leu Lys Ile Cys
140 145 150

Asp Phe Gly Ala Ser Arg Leu His Asn His Thr Thr His Met Ser
155 160 165

Leu Val Gly Thr Phe Pro Trp Met Ala Pro Glu Val Ile Gln Ser
170 175 180

Leu Pro Val Ser Glu Thr Cys Asp Thr Tyr Ser Tyr Gly Val Val
185 190 195

Leu Trp Glu Met Leu Thr Arg Glu Val Pro Phe Lys Gly Leu Glu
200 205 210

Gly Leu Gln Val Ala Trp Leu Val Val Glu Lys Asn Glu Arg Leu
215 220 225

Lys Lys Leu Glu Arg Asp Leu Ser Phe Lys Glu Gln Glu Leu Lys
230 235 240

Glu Arg Glu Arg Arg Leu Lys Met Trp Glu Gln Lys Leu Thr Glu
245 250 255

Gln Ser Asn Thr Pro Leu Leu Leu Pro Leu Val Ala Arg Met Ser
260 265 270

Glu Glu Ser Tyr Phe Glu Ser Lys Thr Glu Glu Ser Asn Ser Ala
275 280 285

Glu Met Ser Cys Gln Ile Thr Ala Thr Ser Asn Gly Glu Gly His
290 295 300

Gly Met Asn Pro Ser Leu Gln Ala Met Met Leu Met Gly Phe Gly
305 310 315

Asp Ile Phe Ser Met Asn Lys Ala Gly Ala Val Met His Ser Gly
320 325 330

Met Gln Ile Asn Met Gln Ala Lys Gln Asn Ser Ser Lys Thr Thr
335 340 345

Ser Lys Arg Arg Gly Lys Lys Val Asn Met Ala Leu Gly Phe Ser
350 355 360

Asp Phe Asp Leu Ser Glu Gly Asp Asp Asp Asp Asp Asp Asp Gly
365 370 375

Glu Glu Glu Asp Asn Asp Met Asp Asn Ser Glu
380 385

<210> 15 
<211> 152
<212> PRT 
<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526442CD1 

<400> 15 

Met Asp Gln Tyr Cys Ile Leu Gly Arg Ile Gly Glu Gly Ala His
1 5 10 15

Gly Ile Val Phe Lys Ala Lys His Val Glu Thr Gly Glu Ile Val
20 25 30

Ala Leu Lys Lys Val Ala Leu Arg Arg Leu Glu Asp Gly Phe Pro
35 40 45

Asn Gln Ala Leu Arg Glu Ile Lys Ala Leu Gln Glu Met Glu Asp
50 55 60

Asn Gln Tyr Val Val Gln Leu Lys Ala Val Phe Pro His Gly Gly
65 70 75

Gly Phe Val Leu Ala Phe Glu Phe Met Leu Ser Asp Leu Ala Glu
80 85 90

Val Val Arg His Ala Gln Arg Pro Leu Ala Gln Ala Gln Val Lys
95 100 105

Ser Tyr Leu Gln Met Leu Leu Lys Gly Val Ala Phe Cys His Ala
110 115 120

Asn Asn Ile Val His Arg Asp Leu Pro Pro Arg Pro Ile Gln Gly
125 130 135

Pro Pro Thr Ser Met Thr Ser Thr Trp Thr Gly Leu Leu Arg Ser
140 145 150

Arg Cys



<210> 16 

<211> 4430 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526185CB1 

<400> 16 

ccggctccag cggccagcgc gcgcgggccc aggccgcccg gctccagccc agbagtagcg 60
gcagcagcgg cggcggcggc agtgcgcgcg aggccctgcg cccccagcag ctcctccctg 120
gcgccgtgca tggagacgcg gcccgccacc cgccgctgag cccccgccgc ccggccggga 180
cccgccaggg ctggggtggc ctcgggctcc ggccggcccc gccgcccgag ggctgcgcgc 240
ggcccgcggg cctcgccgcc ccgcgcggat cgtcgcggcc cggccgtccc gtcccaggaa 300
gtggccgtcc tgagcgccat ggctcactcc ccggtgcagt cgggcctgcc cggcatgcag 360
aacctaeiagg cagacccaga agagcttttt acaaaactag agaaaattgg gaagggctcc 420
tttggagagg tgttceiaagg cattgacaat cggactcaga eiagtggttgc cataaagatc 480
attgatctgg aagaagctga agatgagata gaggacattc eiacaagaaat cacagtgctg 540
agtcagtgtg acagtccata tgtaaccaaa tattatggat cctatctgaa ggatacaaaa 600
ttatggataa taatggaata tcttggtgga ggctccgcac tagatctatt agaacctggc 660
gcattagatg aaacccagat cgctactata ttaagagaaa tactgaaagg actcgattat 720
ctccattcgg agaagaaaat ccacagagac attaaaggca gacatctggt ccctgggcat 780
aacagctatt gaacttgcaa gaggggaacc acctcattcc gagctgcacc ccatgaeiagt 840
tttattcctc attccaaaga acaacccacc gacgttggaa ggaaactaca gtaaacccct 900
caaggagttt gtggaggcct gtttgaataa ggagccgagc tttagaccca ctgctaagga 960
gttattgaag cacaagttta tactacgcaa tgcaaagaaa acttcctact tgaccgagct 1020
catcgacagg tacaagagat ggaaggccga gcagagccat gacgactcga gctccgagga 1080
ttccgacgcg gaaacagatg gccaagcctc ggggggcagt gattctgggg actggatctt 1140 
cacaatccga gaaaaagatc ccaagaatct cgagaatgga gctcttcagc catcggactt 1200 
ggacagaaat aagatgaaag acatcccaaa gaggcctttc tctcagtgtt tatctacaat 1260 
tatttctcct ctgtttgcag agttgaagga gaagagccag gcgtgcggag ggaacttggg 1320 
gtccattgaa gagctgcgag gggccatcta cctagcggag gaggcgtgcc ctggeatctc 1380 
cgacaccatg gtggcccagc tcgtgcagcg gctccagaga tactctctaa gtggtggagg 1440 
aacttcatcc cactgaaatt cctttggcat ttggggtttt gtttttcctt ttttccttct 1500 
tcatcctcct ccttttttaa aagtcaacga gagccttcgc tgactccacc gaagaggtgc 1560 
gccactggga gccaccccag tgccaggcgc ccgtccaggg acacacacag tcttcactgt 1620 
gctgcagcca gatgaagtct ctcaigatggg tggggagggt cage tec ttc cagcgatcat 1680 
tttattttat tttattactt ttgtttttaa ttttaaccat agtgcacata ttccaggaaa 1740 
gtgtctttaa aaacaaaaac aaaccctgaa atgtatattt gggattatga taaggcaact 1800 
ciaagacatga aacctcaggt atcctgcttt aagttgataa ctccctctgg gagctggaga 1860 
atcgctctgg tggatgggtg tacagatttg tatataatgt catttttacg gaaacccttt 1920 
cggcgtgcat aaggaatcac tgtgtacaaa ctggccaagt gcttctgtag ataacgtcag 1980 
tggagtaaat attcgacagg ccataacttg agtctattgc cttgccttta ttacatgtac 2040 
attttgaatt ctgtgaccag tgatttgggt tttattttgt atttgcaggg tttgtcatta 2100 
ataattaatg cccctctctt acagaacact cctatititgta cctcaacaaa tgcaaatttt 2160 
ccccgtttgc cctacgcccc ttttggtaca cctagaggtt gatttccttt ttcatcgatg 2220 
gtactatttc ttagtgtttt aaattggaac atatcttgcc tcatgaagct ttaaatbata 2280 
attttcagtt tctccccatg aagcgctctc gtctgacatt tgtttggaat cgtgccactg 2340 
ctggtctgcg ccagatgtac cgtcctttcc aatacgattt tctgttgcac cttgtagtgg 2400 
attctgcata tcatctttcc cacctaaaaa tgtctgaatg cttacacaaa taaattttat 2460 
aacacgctta ttttgcatac tccttgaaat gtgactcttc agaggacagg gcacctgctg 2520 
tgtatgtgtg gccgtgcgtg tgtactcgtg gctgtgtgtg tgtgatgaga cactttggaa 2580 
gactccaggg agaagtcccc aggcctggag ctgccgagtg cccaggtcag cigccctggac 2640 
tgcttgcgca cttgctcacc gagatgatgc agttggaggt tgctgatctg tgcgattgct 2700 
gtagcggttg ccggggacct taagagttat tttgcttctc tggaaggggc ctatgcttgc 2760 
taggcaggca gccagtgtgt ctgtttttct tggtttgctg tgggaccttg cttggcgagg 2820
gggaaaatct ctgg'gtittct ggagtgggag ggttcgtgca gcagctgttg actggtacat 2880 
gaagcattct tttatgtttg ttgaagctga tgattgacat ctcccgtggg tgtgccagHt 2940 
cttgtggagt taagacagga tttttggaag caaggaagtt agtgggtgag cttggggatg 3000
tagctcagct atctgctggt ctagtggcct ctaagctata gggaggggac agagccctga 3060 
gctacagatg cttgagtggg ttattgtgtc ggtttgctag tgcagtctgg tttttaagct 3120 
ctaaaattga ggtattttat tagaagtgga tttgggttga actcttaatt tgtataaggg 3180 
atatattttg gttggggaaa tagaactgag ttgctaattc ttattgtact cattacbcca 3240 
tacaagaatg ttatgttgaa taataaaatt ggagaagatt tcattttgtg tttccaggga 3300 
gtattctgtg tggggaactg tttccttacg tgaggccggc ggcataagtc aaagatgagt 3360 
tttgtccttg cgaatcacac agattgagtc tgtgttcccc agggtgtgcc gttacctgat 3420 
ttttaagtga gccagggcgg acagcagctt ttctgattta cagagttctt cagatttaca 3480 
aatggacaat gacatcacag tttfctagcac tgaagccagt ctcatgctag taacagtggg 3540 
tgagccgctc gagggactgg gttctaatga atactggtat gaacggggag tctctgcagt 3600 
cgccagacaa atcatactca gccccttccc ccgtagagca acaagtggtt cttttagagt 3660 
tgactggcag catttcctgt cgggggaggt ggggtttgat ggagttagaa agctcgcctc 3720 
tgtgtacatt ctctcctggg ctgttacttt ctgtagacgc acaaaatcag ccccaatgtt 3780 
tttaagggca tcttagccaa ggaagctggc ttttgtgtcg ccacttccag gcctgcatta 3840 
agagagagcc caggcaccag ggctaccact ggeiacctgcc tcagcgtcaa ctgctgctgg 3900 
tctgtagcca ggcccagcct ttgagacggg tttactgtca ccagtagcct ctcagtgcca 3960 
gccctgagct gctcctggct cagctgccca gagcctgcag cctggggagg tactcagccb 4020 
ctgggagacg agggccgtgg actgggtggc tggtagctcc tgcgtttttg agctgtgtcc 4080 
tggctggctg ctgccaatga ggtggacacc agtgtggttt ggggtgcact ggccacbtct 4140 
tgctgggttc tgattttctt ggaagtgcat ctgccttcct tatccaatag ttttatccct 4200 
gcattgctct tgtgaagtgg ctggtfctggt bcbgtabgta gcabbbbgba ccbbbcctcb 4260 
ggcaaaacac tgtcagttba baaacabbbb bbababbbcc cbccbbbaaa aacagcbtgb 4320 
gtatttctgc tataaaabgt gtcagcaaag gcagagtgac ctaatagggc abgbbcbtaa 4380 
gcacagggac tgtatcabgc aggggccaab aaagctcaag aaaacgagta 4430 

<210> 17
<211> 3276

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526192CB1 

<400> 17 

caaaacgccg tggccgtcgc gcggcgccat ccgttgtcgc aaagcggcgc gagaaacgcc 60 
cagccgggtg ttggccccgc cccgcgctgt gacgtcggcg gcgcgcgccc ccggcgccgt 120 
ggccgcggct gcgcagtggg gcgcgtatgg ctgccgcccc ccgctggcag acgctggcgg 180 
cgtaaggcgc gcgggccccg gagcgggcgc ggcggagcgc ggcgagcccg gcgccbcccg 240 
tcccgaacat gcggaggccg gcccaggcgg cgcgggagcc ggagcggggg cccaagcggc 300
accggagccg gagcgcgagg gggcgcgggg cccggagcgg gggtccgcgc tgcgctgctg 360 
aggccgggcc ggccgcccag acgctgcccg cgggcccggc cacggcggag ccaagctgtg 420 
agccgtgagc tttgaggcgg tgggatgtgt cagcagaatg tctcctgccc ccgagagcga 480 
ccccgaggcc actgagaaga gcagcgcggc ctggccggcc cgaacgcctg cgtctcagta 540 
gctgggagcc acgggcccac gcccgcccac cggccgcagt gatgttctag ccacagagga 600 
gccaagacct caggtttcca gagacttggg atttgcacgg cagcagagtc accgtggaga 660 
ggccagggta tcacaaactt atggattttg acaagaaagg agggaaaggg gagacggagg 720 
agggccggag aatgtccaag gccggcgggg gccggagcag ccacggcatc cggagctcgg 780 
ggaccagctc gggggtcctg atggtgggcc ccaacttccg cgtcggcaag aagatcggct 840 
gcggcaactt cggggagctc cgcctaggaa agaatctcta tacaaatgaa tacgtggcta 900 
tcaaattggt gagtcggccc ctccacccca cccccgctga cgtgcccccc agggatttca 960 
gggcagcgac ccggtcccct ggtgactcgc tcttgtgccc ccaggagccg atcaagtccc 1020 
gggccccgca gctgcacctg gagtaccggt tctacaagca gctcagcgcc acagagggcg 1080 
tccctcaggt ctactacttc ggtccgtgcg ggaagtacaa cgccatggtg ctggagctgc 1140 
tggggcccat cctggaggac ctgttcgacc tgtgcgaccg gaccttcacg ctcacgacgg 1200 
tgetgatgat cgccatccag ctgatcacgc gcatggagta tgtgcacacc aagagcctaa 1260 
tctaccggga cgtgaagccc gagaacttcc tggtgggccg cccggggacc aagcggcagc 1320 
atgccatcca catcatcgac ttcgggctgg cceiaggagta catcgacccc gagaccaaga 1380 
agcacatccc gtaccgcgag cacsiagagcc tgacgggcac ggcgcgctac atgagcatca 1440 
acacgcacct gggcaaggag cagagccgcc gcgacgacct ggaggcgctg ggccacatgt 1500 
tcatgtactt cctgcgcggc agcctcccct ggca^gggct caaggtgggc gaggaggccg 1560 
ggcaggcggg cggggacgca gggcgggagc aaggctgacc acagaccccc gcaggccgac 1620 
acgctcaagg agcggtacca gaagatcggg gacaccaaac gcgccacgcc catcgaggtg 1680 
ctctgcgaga acttcccaga ggagatggcc acgtacctgc gctatgtgcg gcgcctggac 1740 
ttcbtcgaga agcccgacta tgactacctg cggaagctct tcaccgacct cttcgaccgc 1800 
agtggcttcg tgttcgacta tgagtacgac tgggccggga agcccctgcc gacccccatc 1860 
ggcaccgtcc acaccgacct gccctcccag cctcagctcc gggacaaagc ccagccgcac 1920 
agcaaaaacc aggcgttgaa ctccaccaac ggggagctga atgcggacga ccccacggcc 1980 
ggccactcca acgccccgat cacagcgcct gcagaggtgg aggtggccga tgaaaccaaa 2040 
atgctgcacc aaagctcggg cgccgcgggc acggctgctg cagtctcttc ccagcctggc 2100 
cctggcaagg ggcgggtggg cgctgccagg cgggtgcttc tcgacgcact tgctcccgga 2160 
ggctgcgccc cggcgcctgg aacccgaggt gggaggaccg gttggtgtca ccctgctcgg 2220 
ccctcagccc tgccgcgtgg ggcgcgtggg cacggagctt cttgcctctc tgctccgaca 2280 
cccggcaagc agticggagac aaaacgcctt asLagcccccg gcccagccct gcaggbatat 2340
tgcaggggcc tgggggcggc cctggactgg cgggcggttc cccagtgggg tgccctggag 2400 
gctgccgggc agagtggagc agcttggggc cgtgcccagg gcggtggctg tgagtctagt 2460 
ttttgcttta ccaagtgtac agaaatggca tttacgtbtc tctgatgctc ccttgaagcc 2520 
atagaattta ggggcttttt taaaaaaata aaagaaaaat gaaaccaaac ccaagtgtag 2580 
agggatttgt ctgggctttc cacgaagctt gacctggaac gggcgttgct tccatcccca 2640 
tcctgcctgt ccgggacgag tccggagcgg ctggcggcct ccggtaacag aaaccgactg 2700 
atgaggcgga aggtaaggaa gatggaagca gagggcagag ctgggctctg tctggggaga 2760 
gggcaggaga cgagtgttca cgtaccatgg aaaggggaag tcacacacat gcgacttggc 2820 
cccgggggtc ccgttccccg acactacaca aacatacctg aaagcctcag cgacggggcc 2880 
caggcaggat ggtcctggct gctctgacgg cggaaggcct ccttgactcc ctctgttcac 2940 
gcagcagggc agaaaacatc tccacggggg ccacgacact gtgaagggaa tcagcagtag 3000 
ctcccagaag aacagcggaa actgcaggca ggtgaagacc ttgcagcact agccccggct 3060
ccgccccgtg ccttctcccc agacaacacc ccatacccgg cagcaagggt ggaagaccag 3120 
taccaccgta atatgttgtg acaeiagcaga aataatgcac ctgtaagagt cagatggcaa 3180 
gagggaaatg geiatgagctc atcgatggtt ttcccggcag tagcttgggg ataaggacta 3240 
cttgtcatgt gctttatata tttacccaca tgttaa 3276 

<210> 18 
<211> 3910 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7526193CB1 

<400> 18 

tgcggggggc tgggggggga acttagttgt tggcagtttc ttcacgggat gtgtttaaat 60 
tgccgagtcc ccacatacgc gccaccccac aaatictcctt cgaggccgtg gaggccacac 120 
ggctgccgcc tcgccctctc ctccaggagt atgctgggat ttgtagtcca gcagccggac 180 
tgtgccgagc tacctttccc agcttgccct gcggctcggg tgatatcaac agtcttttcc 240
agaactctgt ctgcactgag accctcttcc cccagtcctc ttctcgcggt cgactccttc 300 
ccatccgtgg cgacagaacg gcggttgcag gagaggcccc ggtccctcgc cgcgccgccc 360 
cgaggggcac ttccggcggc ggttcacttc ctggttgggt ggatggagcc gggcgggagc 420 
gcgcgcgggg gaggggcggc gggtcagtcb ccgcccggcg ctcccgggat cagctggcgg 480 
gcgggcggga gccgagcgcg gccccggctc tcgctgcagc gccgcctctt ctctgcgtcg 540 
caggccggcc cggcggccgt gacaatgtcg cggggctggt agcagggcgc cggccgccga 600 
gccgtctcaa gtttaaactt acacgaatcg ctttctggag gaggagggga cccgctgcgc 660 
gattgacacg catattccta taggcatcct ccctcagccc ccacccccac ggccggattc 720 
gggtggctcc tctccgaggt gaaatctgag aagaaatcct tggatctctt ttcttaaaaa 780 
aaaaaaaaaa aaaaaaaaaa tctagaaacc atcggtattt tgctttgctg ctccctattc 840 
gcaagatgaa gaagtttttc gactcccggc gagagcaggg cggctctggc ctgggctccg 900 
gctccagcgg aggagggggc agcacctcgg gcctgggcag tggctacatc ggaagagtct 960 
tcggcatcgg gcgacagcag gtcacagtgg acgaggtgtt ggcggaaggt ggatttgcta 1020 
ttgtatttct ggtgaggaca agcaatggga tgeiaatgtgc cttgaaacgc atgtttgtca 1080 
acaatgagca tgatctccag gtgtgcaaga gageiaatcca gataatgagg gatctttcag 1140 
ggcacaagaa tattgtgggt tacattgatt ctagtatcaa caacgtgagt agcggtgatg 1200 
tatgggaagt gctcattctg atggactttt gtagaggtgg ccaggtggta aacctgatga 1260
accagcgcct gcaeiacaggc tttacagaga atgaagtgct ccagatattt tgtgatacct 1320 
gtgaagctgt tgcccgcctg catcagtgca aaactcctat tatccaccgg gacctgaagg 1380 
ttgaaaacat cctcttgcat gaccgaggcc actatgtcct gtgtgacttt ggaagcgcca 1440 
ccaacaaatt ccagaatcca caaactgagg gagtcaatgc agtagaagat gagattaaga 1500 
aatacacaac gctgtcctat cgagcaccag aaatggtcaa cctgtacagt ggcaaaatca 1560 
tcactacgaa ggcagacatt tgggctcttg gatgtttgtt gtataaatta tgctacttca 1620 
ctttgccatt tggggaaagt caggtggcaa tttgtgatgg aaacttcaca attcctgata 1680 
attctcgata ttctcaagac atgcactgcc taattaggta tatgttggaa ccagaccctg 1740 
acaaaaggcc ggatatttac caggtgtcct acttctcatt taagctactc aagaaagagt 1800 
gcccaattcc aaatgtacag aactctccca ttcctgcaaa gcttcctgaa ccagtgaaag 1860 
ccagtgaggc agctgcaaaa aagacccagc caaaggccag actgacagat cccattccca 1920
ccacagagac ttcaattgca ccccgccaga ggcctaaagc tgggcagact cagccgaacc 1980 
caggaatcct tcccatccag ccagcgctga caccccggaa gagggccact gttcagcccc 2040 
cacctcaggc tgcaggatcc agcaatcagc ctggcctttt agccagtgtt ccccaaccaa 2100 
aaccccaagc cccacccagc cagcctcbgc cgceiaacbca ggccaagdag ccacaggctc 2160 
ctcccactcc acagcagacg ccttctactc aggcccaggg tctgcccgct caggcccagg 2220 
ccacacccca gcaccagcag caactcttcc tcaagcagca acagcagcag caacagccac 2280 
cgccagcaca gcagcagccg gcaggcacgb tttaccagca gcagcaggcc cagactcagc 2340 
agtttcaggc agtacatcca gcaacccagc aaccagcaat tgctcagttc cctgtggtgt 2400 
cccaaggagg ctctcaacag cagctaatgc agaatbtcta ccagcagcag cagcagcagc 2460 
aacaacaaca gcaacagcaa cagcbggcca cagccctgca tcaacaacag cbgabgacbc 2520 
agcaggcbgc cbbgcagcaa aagcccacba bggcagcagg acagcagccc cagccacagc 2580 
cagctgcagc cccacagcca gcccctgccc aggagccagc gcagattcaa gccccagtaa 2640
gaceiacagcc aaaggttcag acaaccccac ctcctgccgt ccaggggcag aaagttggat 2700 
ctctcactcc accctcatcc cccaaaaccc aacgtgctgg gcacaggcgt attctcagtg 2760 
acgtaaccca cagtgcagtc tttggggtcc ctgccagcaa atcaiacccag ctgctccagg 2820 
cagctgcagc tgaggccagt ctcaataagt ccaagtctgc aaccaccact ccatcaggct 2880 
ctcctcggac ctctcaacaa aacgtttata atccttcaga agggtctacg tggaatccct 2940 
ttgatgacga taatttctcc aaactcacag ctgciagaact gctaaacaag gactttgcca 3000 
agcttgggga aggcaaacat cccgagaagc ttggaggctc agctgagagt ttgatcccag 3060 
gctttcaatc aacccaaggt gatgcttttg ctacgacctc attttctgct ggaactgaaa 3120 
aactaattga gggactcaaa tctcctgaca cttctcttct gctccctgac ctcttgccta 3180 
tgacagatcc ttttggtagc acttctgatg ctgtaattga aaaagctgat gttgctgttg 3240 
agagtctcat accaggactg gagcccccag ttccccagcg cctcccatct cagacggaat 3300 
ctgtgacctc gaatcgcaca gattctctca ccggggaaga ttccctgctt gattgctctc 3360 
tgctctctaa ccctactact gaccttctgg aagagtttgc ccccacagca atctctgctc 3420 
cagtccataa agctgcagaa gatagtaatc tcatctcagg ttttgatgtc cctgagggct 3480 
cggacaaggt ggctgaagat gagtttgacc ctattcctgt attgataacc aaaaacccac 3540
aaggtgggca ctctagaaac agcagtggga gctctgagtc cagtcttccc aacctagcca 3600
ggtctttact gctggtggat cagctcatag acctgtagcc gtgacccagt agcagatgca 3660
gttctgtaac cttcataccg taaaatacat tttcattacg gagttatgaa aaaaatgatt 3720
tttttaaaaa aatctgcaaa taaggggccc tccagccctt ttctcctacc ccttgccttc 3780 
tcctgtagaa atgataaggfa aagaaaatca ctttgggcct ccagatattc cttgggcagt 3840 
tcctccttgt tagtttgctg tgttttctca ttacccttct tcaatagcat tatcttaaat 3900 
caagcactag 3910 

<210> 19 

<211> 4380 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526196CB1

<400> 19 

gccgcggcgg aggggacggg gctaggccgg gtcgccgcct gacgcgacgc gtcctcacgg 60 
gcgcctacgt cacggcgtcg aggcggaaga tggtgcacct ccgggccggc ggttgctgag 120 
ctgacccgga cggcgaggga gcgggagccc gagcccgacc actccggctg ccgcggggtg 180 
cggcgcagcc accgccatgt cgctgctgca gtcggcgctc gacttcttgg cgggtccagg 240 
ctccctgggc ggtgcttccg gccgcgacca gagtgacttc gtggggcaga cggtggaact 300 
gggcgagctg cggctgcggg tgcggcgggt cctggccgaa ggagggtttg catttgtgta 360 
tgaagctcaa gatgtgggga gtggcagaga gtatgcatta aagaggctat tatccaatga 420 
agaggaaaag aacagagcca tcattcaaga agtttgcttc atgctctgtt cactcggaga 480
gcccgccggc tgcctgagtg tgggttcggg tggacacagc cacgcctcag cctccctgcg 540
cacagccccc tgagggccct gcctcctcct gccacgcgcg ggatggactt tggtgtcgct 600 
gtggtcagtg cacagaactg tggacatggt tatgtacgtt ctcctttaaa caagacaact 660 
gcagaaaaag ctttccggcc acccgaacat tgtccagttt tgttctgcag cgtctatagg 720
aaaagaggag tcagacacgg ggcaggctga gttcctcttg ctcacagagc tctgtaaagg 780
gcagctggtg gaatttttga agaaaatgga atctcgaggc cccctttcgt gcgacacggt 840
tctgaagatc ttctaccaga cgtgccgcgc cgtgcagcac atgcaccggc agaagccgcc 900
catcatccac agggacctca aggttgagaa cttgttgctt agtaaccaag ggaccattaa 960 
gctgtgtgac tttggcagtg ccacgaccat ctcgcactac cctgactaca gctggagcgc 1020 
ccagaggcga gccctggtgg aggaagagat cacgaggaat acaacaccaa tgtatagaac 1080
accagaaatc atagacttgt attccaactt cccgatcggc gagaagcagg atatctgggc 1140
cctgggctgc atcttgtacc tgctgtgctt ccggcagcac ccttttgagg atggagcgaa 1200
acttcgaata gtcaatggga agtactcgat ccccccgcac gacacgcagt acacggtctt 1260 
ccacagcctc atccgcgcca tgctgcaggt gaacccggag gagcggctgt ccatcgccga 1320 
ggtggtgcac cagctgcagg agatcgcggc cgcccgcaac gtgaacccca agtctcccat 1380 
cacagagctc ctggagcaga atggaggcta cgggagcgcc acactgtccc gagggccacc 1440 
ccctcccgtg ggccccgctg gcagtggcta cagtggaggc ctggcgctgg cggagtacga 1500 
ccagccgtat ggcggcttcc tggacattct gcggggtggg acagagcggc tcttcaccaa 1560
cctcaaggac acctcctcca aggtcatcca gtccgtcgct aattatgcaa agggtgacct 1620 
ggacatatct tacatcacat ccagaattgc agtgatgtca ttcccagcag aaggtgtgga 1680 
gtcagcgctc aaaaacaaca tcgaagatgt gcggttgttc ctggactcca agcacccagg 1740
gcactatgcc gtctacaacc tgtccccgag gacctaccgg ccctccaggt tccacaaccg 1800
ggtctccgag tgtggctggg cagcacggcg ggccccacac ctgcacaccc tgtacaacat 1860 
ctgcaggaac atgcacgcct ggctgcggca ggaccacaag aacgtctgcg tcgtgcactg 1920 
catggacggg agagccgcgt ctgctgtggc cgtctgctcc ttcctgtgct tctgccgtct 1980 
cttcagcacc gcggaggccg ccgtgtacat gttcagcatg aagcgctgcc caccaggcat 2040
ctggccatcc cacaaaaggt acatcgagta catgtgtgac atggtggcgg aggagcccat 2100 
cacaccccac agcaagccca tcctggtgag ggccgtggtc attgacaccc gtgccgcgtg 2160 
ttcagcaagc agaggagcgg ctgcaggccc ttctgcgagg tctacgtggg ggacgagcgt 2220 
gtggccagca cctcccagga gtacgacaag atgcgggact ttaagattga agatggcata 2280 
ggggtgattc ccctgggcgt cacggtgcaa ggagacgtgc tcatcgtcat ctatcacgcc 2340 
cggtccactc tgggcggccg gctgcaggcc aagatggcat ccatgaagat gttccagatt 2400 
cagttccaca cggggtttgt gcctcggaac gccaccactg tgaaatttgc caagtatgac 2460 
ctggacgcgt gtgacattca agaaaaatac ccggatttat ttcaagtgaa cctggaagtg 2520
gaggtggagc ccagggacag gccgagccgg gaagccccac catgggagaa ctcgagcatg 2580
agggggctga accccaaaat cctgttttcc agccgggagg agcagcaaga cattctgtct 2640
aagtttggga agccggagct tccccggcag cctggctcca cggctcagta tgatgctggg 2700 
gcagggtccc cggaagccga acccacagac tctgactcac cgccaagcag cagcgcggac 2760 
gccagtcgct tcctgcacac gctggactgg caggaagaga aggaggcaga gactggtgca 2820 
gaaaatgcct cttccaagga gagcgagtct gccctgatgg aggacagaga cgagagtgag 2880 
gtgtcagatg aagggggatc cccgatctcc agcgagggcc aggaacccag ggccgaccca 2940 
gagccccccg gcctggcagc agggctggtg cagcaggact tggtttttga ggtggagaca 3000 
ccggctgtgc tgccagagcc tgtgccacag gaagacgggg tcgacctcct gggcctgcac 3060 
tccgaggtgg gcgcagggcc agctgtaccc ccgcaggcct gcaaggcccc ctccagcaac 3120 
accgacctgc tcagctgcct ccttgggccc cctgaggccg cctcccaggg gcccccggag 3180 
gatctgctca gcgaggaccc gctgctcctg gcaagcccgg cccctcccct gagcgtgcag 3240 
agcccccaag aggagggccc cctgccgctg ctgacccctt tggcccgctt ctgccgtctt 3300 
caggcaacaa ctcccagccc tgctccaatc ctgatctctt cggcgaattt ctcaattcgg 3360 
actctgtgac cgtcccacca tccttcccgt ctgcccacag cgctccgccc ccatcctgca 3420
gcgccgactt cctgcacctg ggggatctgc caggagagcc cagcaagatg acagcctcgt 3480 
ccagcaaccc agacctgctg ggaggatggg ctgcctggac cgagactgca gcgtcggcag 3540
tggcccccac gccagccaca gaaggccccc tcttctctcc tggaggtcag ccggcccctt 3600
gtggctctca ggccagctgg accaagtctc agaacccgga cccatttgct gaccttggcg 3660
acctcagctc cggcctccaa gacccccaag cccagagcac agtgagccca aggggacagc 3720
gtgtctgcac ctgttccagg cgactgccaa ctggcaagct aaaaccggga gttgctgaca 3780 
ctggcactgc tgccagcccc caccggcatt gtggctcacc agctggattc cctcctgggg 3840 
gcttcattcc caaaacggcc accacgccca aaggcagcag ctcctggcag acaagtcggc 3900 
cgccagccca gggcgcctca tggccccctc aggccaagcc gccccccaaa gcctgcacac 3960 
agccaaggcc taactatgcc tcgaacttca gtgtgatcgg ggcgcgggaa gagcgggggg 4020 
tccgcgcacc agctttgctc aaaaccaaaa gtctctgaga acgactttga agattgttgt 4080 
ccaataaggc ttctcctcca ggtctgaaag aaagggcaaa gacattgcag agatgaggag 4140 
aggacctggc taaagacacg gacccactca agctgaagct cctggactgg attgagggca 4200 
aggagcggaa catccgggcc ctgctgtcca cgctgcacac agtgctgtgg gacggggaga 4260 
gccgctggac gcccgtgggc atggccgacc tggtggctcc ggagcaagtg aagaagcact 4320 
atcgccgcgc ggtgctggct gtgcaccccg acaaggtgag cagagctgcc aggcggjccgq 4380

<210> 20 

<211> 4293 

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526198CB1 

<400> 20 
gccgcggcgg aggggacggg gctaggccgg gtcgccgcct gacgcgacgc gtcctcacgg 60
gcgcttacgt cacggcgtcg aggcggaaga tggtgcacct ccgggccggc ggttgctgag 120 
ctgacccgga cggcgaggga gcgggagccc gagcccgacc actccggctg ccgcggggtg 180 
cggcgcagcc accgccatgt cgctgctgca gtcggcgctc gacttcttgg cgggtccagg 240 
ctccctgggc ggtgcttccg gccgcgacca gagtgacttc gtggggcaga cggtggaact 300 
gggcgagctg cggctgcggg tgcggcgggt cctggccgaa ggagggtttg catttgtgta 360 
tgaagctcaa gatgtgggga gtggcagaga gtatgcatta aagaggctat tatccaatga 420 
agaggaaaag aacagagcca tcattcaaga agtttgcttc atgaaaaagc tttccggcca 480 
cccgaacatt gtccagtttt gttctgcagc gtctatagga aaagaggagt cagacacggg 540 
gcaggctgag ttcctcttgc tcacagagct ctgtaaaggg cagctggtgg aatttttgaa 600 
gaaaatggaa tctcgaggcc ccctttcgtg cgacacggtt ctgaagatct tctaccagac 660 
gtgccgcgcc gtgcagcaca tgcaccggca gaagccgccc atcatccaca gggacctcaa 720 
ggttgagaac ttgttgctta gtaaccaagg gaccattaag ctgtgtgact ttggcagtgc 780
cacgaccatc tcgcactacc ctgactacag ctggagcgcc cagaggcgag ccctggtgga 840
ggaagagatc acgaggaata caacaccaat gtatagaaca ccagaaatca tagacttgta 900
ttccaacttc ccgatcggcg agaagcagga tatctgggcc ctgggctgca tcttgtacct 960 
gctgtgcttc cggcagcacc cttttgagga tggagcgaaa cttcgaatag tcaatgggaa 1020 
gtactcgatc cccccgcacg acacgcagta cacggtcttc cacagcctca tccgcgccat 1080 
gctgcaggtg aacccggagg agcggctgtc catcgccgag gtggtgcacc agctgcagga 1140
gatcgcggcc gcccgcaacg tgaaccccaa gtctcccatc acagagctcc tggagcagaa 1200
tggaggctac gggagcgcca cactgtcccg agggccaccc cctcccgtgg gccccgctgg 1260 
cagtggctac agtggaggcc tggcgctggc ggagtacgac cagccgtatg gcggcttcct 1320 
ggacattctg cggggtggga cagagcggct cttcaccaac ctcaaggaca cctcctccaa 1380 
ggtcatccag tccgtcgcta attatgcaaa gggtgacctg gacatatctt acatcacatc 1440 
cagaattgca gtgatgtcat tcccagcaga aggtgtggag tcagcgctca aaaacaacat 1500
cgaagatgtg cggttgttcc tggactccaa gcacccaggg cactatgccg tctacaacct 1560 
gtccccgagg acctaccggc cctccaggtt ccacaaccgg gtctccgagt gtggctgggc 1620 
agcacggcgg gccccacacc tgcacaccct gtacaacatc tgcaggaaca tgcacgcctg 1680 
gctgcggcag gaccacaaga acgtctgcgt cgtgcactgc atggacggga gagccgcgtc 1740 
tgctgtggcc gtctgctcct tcctgtgctt ctgccgtctc ttcagcaccg cggaggccgc 1800 
cgtgtacatg ttcagcatga agcgctgccc accaggcatc tggccatccc acaaaaggta 1860 
catcgagtac atgtgtgaca tggtggcgga ggagcccatc acaccccaca gcaagcccat 1920 
cctggtgagg gccgtggtca tgacacccgt gccgctgttc agcaagcaga ggagcggctg 1980 
caggcccttc tgcgaggtct acgtggggga cgagcgtgtg gccagcacct cccaggagta 2040
cgacaagatg cgggacttta agattgaaga tggcatagcg gtgattcccc tgggcgtcac 2100 
ggtgcaagga gacgtgctca tcgtcatcta tcacgcccgg tccactctgg gcggccggct 2160 
gcaggccaag atggcatcca tgaagatgtt ccagattcag ttccacacgg ggtttgtgcc 2220 
tcggaacgcc accactgtga aatttgccaa gtatgacctg gacgcgtgtg acattcaaga 2280 
aaaatacccg gatttatttc aagtgaacct ggaagtggag gtggagccca gggacaggcc 2340
gagccgggaa gccccaccat gggagaactc gagcatgagg gggctgaacc ccaaaatcct 2400 
gttttccagc cgggaggagc agcaagacat tctgtctaag tttgggaagc cggagcttcc 2460 
ccggcagcct ggctccacgg ctcagtatga tgctggggca gggtccccgg aagccgaacc 2520
cacagactct gactcaccgc caagcagcag cgcggacgcc agtcgcttcc tgcacacgct 2580
ggactggcag gaagagaagg aggcagagac tggtgcagaa aatgcctctt ccaaggagag 2640
cgagtctgcc ctgatggagg acagagacga gagtgaggtg tcagatgaag ggggatcccc 2700 
gatctccagc gagggccagg aacccagggc cgacccagag ccccccggcc tggcagcagg 2760 
gctggtgcag caggacttgg tttttgaggt ggagacaccg gctgtgctgc cagagcctgt 2820 
gccacaggaa gacggggtcg acctcctggg cctgcactcc gaggtgggcg cagggccagc 2880
tgtacccccg caggcctgca aggccccctc cagcaacacc gacctgctca gctgcctcct 2940
tgggccccct gaggccgcct cccaggggcc cccggaggat ctgctcagcg aggacccgct 3000
gctcctggca agcccggccc ctcccctgag cgtgcagagc accccaagag gagggccccc 3060
tgccgctgct gacccctttg gcccgcttct gccgtcttca ggcaacaact cccagccctg 3120
ctccaatcct gatctcttcg gcgaatttct caattcggac tctgtgaccg tcccaccatc 3180
cttcccgtct gcccacagtg ctccgccccc atcctgcagc gccgacttcc tgcacctggg 3240 
ggatctgcca ggagagccca gcaagatgac agcctcgtcc agcaacccag acctgctggg 3300 
aggatgggct gcctggaccg agactgcagc atcggcagtg gcccccacgc cagccacaga 3360
aggccccctc ttctctcctg gaggtcagcc ggccccttgt ggctctcagg ccagctggac 3420 
caagtctcag aacccggacc catttgctga ccttggcgac ctcagctccg gcctccaaga 3480 
cccccaagcc cagagcacag tgagcccaag gggacagcgt gtctgcacct gttccaggcg 3540 
actgccaact ggcaagctaa aaccgggagt tgctgacact ggcactgctg ccagccccca 3600
ccggcattgt ggctcaccag ctggattccc tcctgggggc ttcattccca aaacggccac 3660
cacgcccaaa ggcagcagct cctggcagac aagtcggccg ccagcccagg gcgcctcatg 3720 
gccccctcag gccaagccgc cccccaaagc ctgcacacag ccaaggccta actatgcctc 3780 
gaacttcagt gtgatcgggg cgcgggagga gcggggggtc cgcgcaccca gctttgctca 3840 
aaagccaaaa gtctctgaga acgactttga agatctgttg tccaatcaag gcttctcctc 3900
caggtctgac aagaaagggc caaagaccat tgcagagatg aggaagcagg acctggctaa 3960 
agacacggac ccactcaagc tgaagctcct ggactggatt gagggcaagg agcggaacat 4020 
ccgggccctg ctgtccacgc tgcacacagt gctgtgggac ggggagagcc gctggacgcc 4080
cgtgggcatg gccgacctgg tggctccgga gcaagtgaag aagcactatc gccgcgcggt 4140 
gctggccgtg caccccgaca aggctgcggg gcagccgtac gagcagcacg ccaagatgat 4200 
gttcatggag ctgaatgacg cctggtcgga gtttgagaac cagggctccc ggcccctctt 4260
ctgaggccgc agtggtggtg gctgcgcaca cag 4293 

<210> 21 

<211> 6538 

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526208CB1

<400> 21 

ggagttactc agaagggaag ggaaggtgtg gttgtgcggc ggagtttttg ctttcattct 60 
tttaacgttc acagccaaag caaaggcctt tggggattgc cagagtctca gccaccatcc 120 
tggaaaacag cgggggaggt gggcctggag gtggcaagtg taatgtggct caggggccgt 180 
cattgcccct tgcagaaggg gctgcggggg agggagaaaa cctgcgcccg gttctgggga 240 
gctggcgacg cagtgaaccc tgctgaggct gggttttgcc ccgacagtcg ctggtggctg 300
tgggaagggt tgggaccctt ctctgagagc agtgaacagc ccacatccgg cccctgctgt 360
gtcaactctg agcggcgtgg agatgaagtg gttgctctcc cttgctcggc ccaccgggtg 420
tcgtggcccg ggaaccggcc tggagaagtc cctgctgccc ggcgcccaaa acaggggcgt 480
gggcttccgc gacccagggc ggctgccccg ggccatcctc gagttgccct gcatcttccc 540
gctcagtcag ccccagattg aggcagcctt ctctgtgcgg gtttaaatgg gtaactgtga 600
cttctcgcct cattcaccca aacctccagt cttctccccc gcacatcctc ctccacccac 660
ctggtttctc cctagactgg tgtgctcgtg tgtgcaacca aaggagggag tgcgagagat 720 
ccacgaaggg acaggcttgg agtcgctaga gggaggtgtg ggaccagcga ggagggggct 780 
tcgccaggga gggggtgctg gcaggcggag ggagcggcgg gaggaggcgc cggaggagga 840 
gacggaggcc tggggacggc agaagaggct tcgcctgagc cgagcgctct ttctctcgcc 900 
gcgccgtctt gaagccgcgc gggctcgtga gcagcgcgag gccgccaagg tgcctcgctt 960 
cgccggagcc gctgccgccc gccggaggga agccggcctc gggcgcgcac gctcgtcgga 1020 
gccccggcgc gccccgcgcc tgagcctgct gacagcgccg cggggccagt cccggggtta 1080 
gccgcgcgtc tgctcgcttc tggtccgtcg cgctcccagc cagggcacag cccggaccga 1140 
ggatggcttc gaccacaacc tgcaccaggt tcacggacga gtatcagctt ttcgaggagc 1200 
ttggaaaggg ggcattctca gtggtgagaa gatgtatgaa aattcctact ggacaagaat 1260 
atgctgccaa aattatcaac accaaaaagc tttctgctag ggtgcgactt catgatagca 1320 
tatcagaaga gggctttcac tacttggtgt ttgatttagt tactggaggt gaactgtttg 1380 
aagacatagt ggcaagagaa tactacagtg aagctgatgc cagtcattgc attcagcaga 1440
tcctggaggc tgtgctacac tgccatcaga tgggcgtggt ccatcgggac ctgaagcctg 1500
agaatttgct tttagctagc aaatccaagg gagcagctgt gaaattggca gactttggct 1560
tagccataga agttcaaggg gaccagcagg cgtggtttgg ttttgctggc acacctggat 1620
atctttctcc agaagtttta cgtaaagatc cttatggaaa gccagtggat atgtgggcat 1680
gtggtgtcat tctctatatt ctacttgtgg ggtatccacc cttctgggat gaagaccaac 1740
acagactcta tcagcagatc aaggctggag cttatgattt tccatcacca gaatgggaca 1800
cggtgactcc tgaagccaaa gacctcatca ataaaatgct tactatcaac cctgccaaac 1860
gcatcacagc ctcagaggca ctgaagcacc catggatctg tcaacgttct actgttgctt 1920
ccatgatgca cagacaggag actgtagact gcttgaagaa atttaatgct agaagaaaac 1980
taaagggtgc catcttgaca actatgctgg ctacaaggaa tttctcagca gccaagagtt 2040
tgttgaagaa accagatgga gtaaagaaaa ggaagtccag ttcgagtgtt cagatgatgg 2100
agtcaactga gagttcaaat acaacaattg aggatgaaga tgtggaagca cgaaagcaag 2160
agattatcaa agtcactgaa caactgatcg aagctatcaa caatggggac tttgaagcct 2220
acacaaaaat ctgtgaccca ggccttactg cttttgaacc tgaagctttg ggtaatttag 2280 
tggaagggat ggattttcac cgattctact ttgaaaatgc tttgtccaaa agcaataaac 2340 
caatccacac tattattcta aaccctcatg tacatctggt aggggatgat gccgcctgca 2400 
tagcatatat taggctcaca cagtacatgg atggcagtgg aatgccaaag acaatgcagt 2460 
cagaagagac tcgtgtgtgg caccgccggg atggaaagtg gcagaatgtt cattttcatc 2520 
gctcggggtc accaacagta cccatcaagc caccctgtat tccaaatggg aaagaaaact 2580 
tctcaggagg cacctctttg tggcaaaaca tctaaggcct gaaaaccatt cacatatggg 2640 
tcttctaaat ttcaacagtg ccacttctgc attctctgtt ctcaaggcac ctggatggtg 2700 
accctgggcc gtcctctcct cctcttcatg catgtttctg agtgcatgaa gttgtgaagg 2760 
tcctacatgt aatgcatatg tgatgcatca tcttatcata tattccttcc tatacattgt 2820 
ttacacttca actacgggga tgttccacac aaacttaaat tactgttggc aaaacaatag 2880
ggggagatta gacaaaaaaa aaaatccaca atattccaag tacaactctt catcaagttt 2940
ctctgttaat gccaagattt aacagactta agaactattg ttctctgaat gacagttgta 3000
agagaaatgt aaatttttta gaactctttg ctgttaatct gttttggttt gtttggtttt 3060
tttttttttt tttaaggtaa aaaaaaaata caccttcagt ttcctggtgt gatcctggtt 3120
aaaatggatg atttttcatt gaaagttttg ctgattaaca attaaagtgg gatgatatgt 3180
gggcaaaatc acttatgaaa gtagaagcaa gaatcagttg gtttgctacc acataaagcc 3240
atgctgtttt tggtcaaact gtgtaaactg gaaaaattca catcatttct gagtttaatc 3300
actttaggat atattcacat tgttttggtg aatttgctga attgaattgt ttttctttct 3360 
caaatctgtg atctcttttc tttatcctgt ttctttgttc ctttcgtttg ctttcttatt 3420 
tttcttttgg ttccattctt ttcttacttt tttccctttt ccttttttgg ggaggctggc 3480 
tagtagtgtg tgagaaaaga atagaagtga aatttgcata atgaatgtaa aagggaaata 3540 
aaagtctttt gaaggtagct atactagcac ttttgatcat cttcagggcc cacaaaaatg 3600
ttgtcaagat tttaaaggtt tataattctg cttaagctct agtttggact taggtatcct 3660 
aactatgttg gaggtatttg cattgtttaa agttaggata aaagcaagtt cctcctgtga 3720 
ctgcaacgtc ttactgattg ggacagttgc caggaggata ccaacttgat agcagagggg 3780 
gttttatgca aacgcactca cctccgcctt ggggaatgaa agggtcactt ctgcatcatc 3840
actagctagt tttctagtgt tagagaggct tacaaatgtt tgccattctc ataagtgttt 3900 
tgaacttgat ctttgtgact tgtgcttttt ttagcttctc tcttgaatca gagtatcatt 3960
gtcttcctcc aaggagttag aatttcccag tttaaaacaa aaagggaaat gtcctaggtt 4020
ttctttgtgc ttctcatttt tcctttgttg attcaattcc tgtgattttt gttctcttcc 4080
ctgaagtgct ttacagtgca tggaatctcc atcattgtta ttttaacgat agtaattcac 4140
agtcctcaga agcctatttt taaagcagaa gcaaaaaaga aaaacaaaat aacaaaaaca 4200
acccttcctc ttttctctca tctcacctct ctgtgttgat tactaatcat cttagatatt 4260
attgctagtg gatgtatggt agatgggttg aagcttttct gataattatt acacaattta 4320
aaacaacata tatatttaaa ataaatatat acagtaaata tattgagcca tgttaacctg 4380
ccaatgagat ctgtgaaaaa ataatggcct catttttctc tttttaattt cttttaccct 4440
tttgtgaagc agctatacgt ggcatacatg tatttaaaga aaaaaaaata gatgtagagt 4500
gtttttttta cacttttaac ttagcatgtg gtgttgaagt attactgtag atcaagtttg 4560
tcttccgcac taagatgtga ggaaattgtg atttgttctc tccaccacaa atgaattaca 4620
catttattat cttctatcat tttgaaacac tgcagtttac catgggacac tgtatatatt 4680
tcttgccata atggtaaagg actgattgat atatttaaga gttaataaat ttgtgatttc 4740
tgctgacagt gcgtccatct ttatttcttc agaagaggta ctgtatgtat gcctgcatag 4800
tgctggccag tgtcaagggc agtgtgtcct actctggtct catttagtac ataacaattt 4860
gcacttggtg aggaggacaa tatagttaac aactaagact cctaaaagct tctctaaact 4920
gtaccctcca atccagcctt cacatggctg cttttttttt tttttttaat acgaacctgt 4980
gcttgtaaca ctttgatgtt atcatttctg ggatacaggc aagcacccca gctcctgcta 5040
ctcgccagct tgaacttgag catacatgga tgctcagctt cttttgattt gctaaaaaca 5100
tcacacttgc tcacatgcct gtttatgctg ttcatgttgt ttatgtttct tacctagaat 5160
aaatagtctc ttcccctact tcttttcctg acttcttact ttttcctaag attcagtgta 5220
cagcatcatg ctccacagca aaccttccta ggccctattc tgggcttgcc ttccctctca 5280
aaacctacat aatagattgt atttacctct cctgtcaacc acattgtttt gaaaatatat 5340
ttctatttgt gtctcctcta ctgcagtata atgtctccat gggcaagaac tgtgtattca 5400
tcattgcatt cctaaaccca aaccaaggcc aggaatggag atatcattga taaatagttg 5460
ttgaattgag gccaagccct tttgataaca gaagcctcaa ggggtaccca gatagtcctt 5520
gttttaatga tgggttctct caccactgtc ttgatgctct gagcaagtta cctcttccct 5580
ctgaccctca gtttccatat ttgtaaaatg agaataaaca taccaactta ataaagatat 5640
tgtgaggatt aatgggtaca gagtgactag aatgatattt gatagaaatt aaatggtagc 5700
agtataacta ttctgatcac tgacattaat attcctattg ttattattct ttactcacga 5760
gggtatacaa ctcttgtttt gctgttgggc tgccctcttt atgtaggttt actgttaatg 5820
ctgaggatat actcggactc aaatgtctca gcagaaggct gagagacacc aaatgaagtg 5880
gtcatctagc tgaatgtagg aaaaatgaaa tgtagtagca aatcagtata ttctaaggaa 5940
attttcaagg aatattaatc ttcacccaaa ttttgaattt ttatgtaaaa aattataatt 6000
taagggtaaa catagatgac acagctttcg agtgatttca ttgaataaaa ttctactgac 6060
ttctatgaac ctttcatggc tctgtggtct ttttatcaga ttttttaaag gtgagaatgt 6120
acaaaaagat tacaatgaag atcaggtact agaccatgtg tccatgaacg tgaacaaaca 6180
gctgtcataa accaccctaa cctgagawg cagcaggaag catttacagc attcctgctt 6240
ttctctcaga caaaaccaat tctcagaaga gagctagaat gttctcctgc agactggagt 6300
aggaaaagtt gataacagat taagcagtaa ttgtactcca gaaggatttg catttaggct 6360
tttgctgctt tacaacagaa aaaaaaaatt cttgtttgtc cgtaaaaagt gtttttatgt 6420
ttttttaaat gtcaccaaca tttaaaaatt ggatatgtca tgtaaaagtc aagatttctg 6480
gcttaattaa tttgaaaaag tgtaaggtct gccccactgg ttctgtgttc actacagc 6538

<210> 22

<211> 2349

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526212CB1

<400> 22
ggagttactc agaagggaag ggaaggtgtg gttgtgcggc ggagtttttg ctttcattct 60
tttaacgttc acagccaaag caaaggcctt tggggattgc cagagtctca gccaccatcc 120
tggaaaacag cgggggaggt gggcctggag gtggcaagtg taatgtggct caggggccgt 180
cattgcccct tgcagaaggg gctgcggggg agggagaaaa cctgcgcccg gttctgggga 240
gctggcgacg cagtgaaccc tgctgaggct gggttttgcc ccgacagtcg ctggtggctg 300
tgggaagggt tgggaccctt ctctgagagc agtgaacagc ccacatccgg cccctgctgt 360
gtcaactctg agcggcgtgg agatgaagtg gttgctctcc cttgctcggc ccaccgggtg 420
tcgtggcccg ggaaccggcc tggagaagtc cctgctgccc ggcgcccaaa acaggggcgt 480
gggcttccgc gacccagggc ggctgccccg ggccatcctc gagttgccct gcatcttccc 540
gctcagtcag ccccagattg aggcagcctt ctctgtgcgg gtttaaatgg gtaactgtga 600
cttctcgcct cattcaccca aacctccagt cttctccccc gcacatcctc ctccacccac 660
ctggtttctc cctagactgg tgtgctcgtg tgtgcaacca aaggagggag tgcgagagat 720
ccacgaaggg acaggcttgg agtcgctaga gggaggtgtg ggaccagcga ggagggggct 780
tcgccaggga gggggtgctg gcaggcggag ggagcggcgg gaggaggcgc cggaggagga 840
gacggaggcc tggggacggc agaagaggct tcgcctgagc cgagcgctct ttctctcgcc 900
gcgccgtctt gaagccgcgc gggctcgtga gcagcgcgag gccgccaagg tgcctcgctt 960
cgccggagcc gctgccgccc gccggaggga agccggcctc gggcgcgcac gctcgtcgga 1020
gccccggcgc gccccgcgcc tgagcctgct gacagcgccg cggggccagt cccggggtta 1080
gccgcgcgtc tgctcgcttc tggtccgtcg cgctcccagc cagggcacag cccggaccga 1140
ggatggcttc gaccacaacc tgcaccaggt tcacggacga gtatcagctt ttcgaggagc 1200
ttggaaaggg ggcattctca gtggtgagaa gatgtatgaa aattcctact ggacaagaat 1260
atgctgccaa aattatcaac accaaaaagc tttctgctag ggtgcgactt catgatagca 1320
tatcagaaga gggctttcac tacttggtgg ttgatttagt tactggaggt gaactgtttg 1380
aagacatagt ggcaagagaa tactacagtg aagctgatgc cagtcattgc attcagcaga 1440
tcctggaggc tgtgctacac tgccatcaga tgggcgtggt ccatcgggac ctgaagcctg 1500
agaatttgct tttagctagc aaatccaagg gagcagctgt gaaattggca gactttggct 1560
tagccataga agttcaaggg gaccagcagg cgtggtttgg ttttgctggc acacctggat 1620
atctttctcc agaagtttta cgtaaagatc cttatggaaa gccagtggat atgtgggcat 1680
gtggtgtcat tctctatatt ctacttgtgg ggtatccacc cttctgggat gaagaccaac 1740
acagactcta tcagcagatc aaggctggag cttatgattt tccatcacca gaatgggaca 1800
cggtgactcc tgaagccaaa gacctcatca ataaaatgct tactatcaac cctgccaaac 1860
gcatcacagc ctcagaggca ctgaagcacc catggatctg tcaacgttct actgttgctt 1920
ccatgatgca cagacaggag actgtagact gcttgaagaa atttaatgct agaagaaaac 1980
taaagggtgc catcttgaca actatgctgg ctacaaggaa tttctcagca gccaagagtt 2040
tgttgaagaa accagatgga gtaaaggagt caactgagag ttcaaataca acaattgagg 2100
atgaagatgt gaaaggcacg gtggctcacg cctgtaatcc cagcactttg ggaggtcgag 2160
gcgggcagat cacctgaggt caggagttca agaccagcat ggccaacatg gtgaaaccct 2220
gtctctacta aaaatacaaa aattagctgg gtgtggtggc aggcacctgt aatcccagct 2280
actctggagg ctgagacagg agaatcgctt gaacccggga ggtggaggtt gcagtgagcc 2340
gagatcaca 2349 

<210> 23 

<211> 8015 

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526213CB1 

<400> 23 

agccgggcag ctgcagcgga gccgcggagc gggcggcggg gcccaggctg tgcgcttggg 60 
gagcgcggaa tgtgaggctt ggcgggccgc agcacgctcg gacgggccag gggcggcgac 120
ccctcgcgga cgcccggctg cgcgccgggc cggggacttg cccttgcacg ctccctgcgc 180 
cctccagctc gccggcggga ccatgaagaa gttctctcgg atgcccaagt cggagggcgg 240 
cagcggcggc ggagcggcgg gtggcggggc tggcggggcc ggggccgggg ccggctgcgg 300 
ctccggcggc tcgtccgtgg gggtccgggt gttcgcggtc ggccgccacc aggtcaccct 360 
ggaagagtcg ctggccgaag tgatacagat gctgccggtt caggaaccac gtcttgagta 420 
ccgagtacca ctgatttcga gcgggcgaag aagactaaga agaaggtgct agagaggtgg 480 
attctccaca gttttcctcg tgcgtactca cggtggaatc cgatgtgcat tgaagcgaat 540 
gtatgtcaat aacatgccag acctcaatgt ttgtaaaagg gaaattacaa ttatgaaaga 600 
gctatctggt cacaaaaata ttgtgggcta tttgggctgt gctgttaatt caattagtga 660 
taatgtatgg gaagtcctta tcttaatgga atattgtcga gctggacagg tagtgaatca 720
aatgaataag aagctacaga cgggttttac agaaccagaa gtgttacaga tattctgtga 780
tacctgtgaa gctgttgcaa ggttgcatca gtgtaagact ccaataattc accgggatct 840
gaaggtagaa aatattttgt tgaatgatgg tgggaactat gtactttgtg actttggcag 900
tgccactaat aaatttctta atcctcaaaa agatggagtt aatgtagtag aagaagaaat 960
taaaaagtat acaactctgt catacagagc ccctgaaatg atcaaccttt atggagggaa 1020
acccatcacc accaaggctg atatctgggc actgggatgt ctactctata aactttgttt 1080 
cttcactctt ccttttggtg agagtcaggt tgctatctgt gatggcaact tcaccatccc 1140 
agacaattct cgttactccc gtaacataca ttgcttaata agg^tcatgc ttgaaccaga 1200 
tccggaacat agacctgata tatttcaagt gtcatatttt gcatttaaat ttgccaaaaa 1260 
ggattgtcca gtctccaaca tcaataattc ttctattcct tcagctcttc ctgaaccgat 1320 
gactgctagt gaagcagctg ctaggaaaag ccaaataaaa gccagaataa cagataccat 1380
tggaccaaca gaaacctcaa ttgcaccaag acaaagacca aaggccaact ctgctactac 1440 
tgccactccc agtgtgctga ccattcaaag ttcagcaaca cctgttaaag tccttgctcc 1500 
tggtgaattc ggtaaccata gaccaaaagg ggcactaaga cctggaaatg gccctgaaat 1560 
tttattgggt cagggacctc ctcagcagcc gccacagcag catagagtac tccagcaact 1620 
acagcaggga gattggagat tacagcaact ccatttacag catcgtcatc ctcaccagca 1680 
gcagcagcag cagcagcagc aacagcaaca gcagcagcag caacagcaac agcagcagca 1740 
gcagcagcag cagcagcagc agcaccacca ccaccaccac cacactactt caagatgptjt 1800
atatgcagca gtatcaacat ggcaacacag cagcaacaga tgcttcaaca acaattttta 1860
atgcattcgg tatatcaacc acaaccttct gcatcacagt atcctacaat gatgccgcag 1920
tatcagcagg ctttctttca acagcagatg ctagctcaac atcagccgtc tcaacaacag 1980
gcatcacctg aatatcttac ctcccctcaa gagttctcac cagccttagt ttcctacact 2040 
tcatcacttc cagctcaggt tggaaccata atggactcct cctatagtgc caataggtca 2100 
gttgctgata aagaggccat tgcaaatttc acaaatcaga agaacatcag caatccacct 2160 
gatatgtcag ggtggaatcc ttttggagag gataatttct ctaagttaac agaagaggaa 2220 
ctattggaca gagaatttga ccttctaaga tcaagttctc ctgaaaagaa agctgaacat 2280 
tcatctataa atcaagaaaa tggcactgca aaccctatca agaacggtaa aacaagtcca 2340 
gcatctaaag atcagcggac tggaaagaaa acctcagtac agggtcaagt gcaaaagggg 2400 
aatgatgaat ctgaaagtga ttttgaatca gatccccctt ctcctaagag cagtgaagag 2460






gaagagcaag atgatgaaga agttcttcag ggggaacaag gagattttaa tgatgatgat 2520
actgaaccag aaaatctggg tcataggcct ctcctcatgg attctgaaga tgaggaagaia 2580 
gaggagaaac atagctctga ttctgattat gagcaggcta aagcaaagta cagtgacatg 2640 
agctctgtct acagagacag atctggcagt ggaccaaccc aagatcttaa tacaatactc 2700
ctcacctcag cccaattatc ctctgatgtt gcagtggaga ctcccaaaca ggagtttgat 2760
gtatttggcg ctgtcccctt ctttgcagtg cgtgctcaac agccccagca agaaaagaat 2820 
gaaaagaacc tccctcaaca caggtttcct gctgcaggac tggagcagga ggaatttgat 2880 
gtattcacaa aggcgccttt tagcaagaag gtgaatgtac aagaatgcca tgcagtgggg 2940 
cctgaggcac atactatccc tggttatccc aaaagtgtag atgtatttgg ctccactcca 3000 
tttcagccct tcctcacatc aacaagtaaa agtgaaagca atgaggacct ttttgggctt 3060 
gtgccctttg atgaaataac ggggagccag cagcaaaaaa gtca^cagc gcagcttaca 3120 
gaaactgtcc tctcgccaaa ggcgcacaaa gcaggatatg tccaaaagta atgggaagcg 3180
gcatcatggc acgccaacta gcacaaagaa gactttgaag cctacctatc gcactccaga 3240
gagggctcgc aggcacaaaa aagtgggccg cgagactctc aaagtagcaa tgaattttta 3300
accatctcag actccaagga gaacattagt gttgcactga ctgatgggaa agataggggg 3360
aatgtcttac aacctgagga gagcctgttg gaccccttcg gtgccaagcc cttccattct 3420
ccagacctgt catggcaccc tccacatcag ggcctgagcg acatccgtgc tgatcacaat 3480 
actgtcctgc cagggcggcc aagacaaaat tcactacatg ggtcattcca tagtgcagat 3540 
gtattgaaaa tggatgattt tggtgccgtg ccctttacag aacttgtggt gcaaagcatc 3600
actccacatc agtcccaaca gtcccaacca gtcgaattag acccatttgg tgctgctcca 3660
tttccttcta aacagtagat acttctgatg gattctcggc attaactcct gtttcaaaaa 3720 
agtgtgaaca gttttatgaa tttgaaagaa aatttggtag ctctttatag cattcattct 3780 
taaagatcag tcagaatagg tgatttctaa ataaaccaaa tagaagaatg aagtatctct 3840 
acagggtagt aacttgattc ctcttcagga gaaaagggag ctaaattgca agctctaact 3900 
aagggtttct gctactgaca tcacaacaca gaaatgcaag tgtggtactt ccagtgaaag 3960 
cacatggcac ctttctaggt gtgtagccac tgagaaggga cagtgaaact gttatttttg 4020
atatcagaat gtcattttta tgtgcatatc cctaaaatta gggttatttc tacatacact 4080
agttacactt gtgaattttt tttaaggtct cttttaattt ccagacagtt aaaaacaatc 4140
tagttatctt aaagcattag aaagttatta tctggagagt gcagagattt cagtccatac 4200
acctttctcc acaaagcaga gccagaagta actgactatt gtgcctaaaa ctctgtttca 4260 
tttttaaaaa caagtgccat taaaatggaa tatctaatga taagcatatg aaataatgtg 4320 
taattagctc aatttaacta ttccacaact tacatattcc aaaacaatgt tatacatgat 4380
aaatatatat aatttttgtc agttaaaaca aattaaaaaa atggactatc gtcgcacaga 4440
agcctagaac aaaaatatga agagaaatat ctgacatttg taaagaaatt ataagaagaa 4500
aaaaagatac agaacagaaa acattcacta ctttagaaac actttatgca tggctkcttg 4560
ccccaaactt ttattgtgat ggccctaata aagcagatta ttggaaaaat tggaggacaa 4620
gggttgtata aaaattttat tttatgaaga aaatatgtag cggaaactga attttcaaga 4680
catttacaat gtgaaatcat gttgcattta acaatgtact ttattagcaa cttcaccaaa 4740
tattccccaa gtcataagca acaattattt ttattaggtt ttggggggtg gagtagtttt 4800 
aataaagtgc acagaatggt gacacccaca aagccttata taaaggcagg attcatgcat 4860 
cctgctgcaa gtacctctgc actaatatac cagatcctaa aatgcatata aggtggacta 4920 
gcatcttaat tctgctagtt gattgtgtct ttactgaaaa ga^cccagct accaatttgc 4980 
ctttttttac accacaaatc ctaattagaa acttgaggtt ttatagaaat catttaatga 5040 
tagagattac atatgtgaat taatgtg^at atagtatctg tgcttcctgt gtctatgact 5100 
attttaagat ataattgtgc tgcgctatca gattaacatt tggaagtttc tagaacagtt 5160 
aatgctattt acagaaagga gtagaaactc atcaactggc actctctttg atttttatat 5220 
tttaaattaa ctctcttcga tctcaaagta tattttacga gtaattttat taggaatctc 5280
ttatagtgcc ccaatgggat aagctatttg cctattttca gagttctgaa cttggaaaga 5340
agcaaagtat atgtaactaa accacatatt tgtcttttta ttgctttttc ccttctttta 5400
tatgctaaat caaatataga tttgtggata gggaagcaat atgtgaatca caatgtagca 5460
gaggcagacc aagcattaca ttattattta gagctggact gcaccaatta cttgtcctcg 5520 
tgccaaaggc aaatatgttt gcaccttttt ttttttttct gattctcagg ttgattaata 5580 
ctgctaatgc aaatgctcaa gtagatgttt aaaaacttta caaaatagat tcaagtgata 5640 
ctttctttta aaagtgaaga gttgatgatt acacatagta aattcatgaa ctacagtagg 5700
tttgtatcaa acaatttttt ttaatgaaaa tctgttgagg tgtacacaat atgcttcttg 5760
attgtattag tccttggtct ctgctagacc tcatgagttt catcatttag aaaaggggta 5820
gaggatgaac taatgtctcc ttcagatgta aacatgaaat acctagagtt ttacttgctt 5880
ttcaatacac tgaataattt taaatgattc tgacactgat gtagaccctt tgacttataa 5940
attctgagga aacaactgac agcataaaat atttacattc ttataacaca gcacagtgac 6000
tttcttcttt caagattgta gctcagagaa aagatacagg attcaattgg gggttcaata 6060
ggatagaaat ggagagattc ctttgtgttg tagtagaggc attttcctaa ggagtataga 6120
tttatacttt gcattttcat tcatcatccc ccagaatcat ggtcaaggtg taggtcactc 6180
cacacagctg atgctcaggt tattcccttg tgagaattat gagaataaag ctcccaagat 6240
atgtgaaagt gcttaacaca gtacctggca cacagcactc aataaaagtt tggctctatt 6300
atgggatggt tcaattctgg tttaaggaag gaagaaaggt tattatatat gtaccactaa 6360
gcaaatatat atatatatat atatttgggg ttttttttcc ctaatattat ttgggtgtcc 6420
cctgtgcttc tttaggatgt agttataact aaacctgtta tacttgaaca tcactaagag 6480
aagtaaatta ttatgaagct agcaaaaatc ttgaggccaa agttgtttct taacagcttt 6540
aataatgctt gttgattttg aataatcctt taaaaagtgg accatttgct tattttaata 6600
tcacgtcagt aaaatgttag tattaaaaag atcagctttt tatggcattg aagaatgtat 6660
ctgctaagac acaaaaattg catggtaagt ataataggtg gaggaggaaa ggttgtaggc 6720
cggatgaaaa tttaactgac tagaacattt attcaggagt gtaattattt tcccttaccc 6780
caatccctgt gtacgtgttg ggtatagtta cgacattatc cggatttgca aatagacaca 6840
actttcagtc ttaccctgtt tattgtttaa gagtgataga ctgttgtcet cttgctggtg 6900
gaaaatctaa ggtggagccc actcttctat gctgaagttc accaggcaga gcagttttct 6960
tacaagtcag ctactctgct tggtttattt taggttttgg tacttcacgt aagcactgtt 7020
agaaggtaca agtgtattaa tatcactagt tttgaggcgc ttgggtacat ttgtttttaa 7080
tatatttaga atgtgcagta aacttttttc tcattttttt ttctttttag caaacttgtt 7140
attttaggtc caattattga gttgacagtc tactgtgaga atgagatgac atatctactg 7200
tgagaatacc ataaatgatg aatagtttat ttgagaactt ttatactcag tggtgtttat 7260
atattaagat aaaaatatgt acacacatgc atgtcacatc tctctactgt ggagttaatg 7320
tgaattttta aaaaatggaa ttgcaaccac aatcatatct aagagaacat tcactcctag 7380
tgagggttta caaaagctac taagagaata aggtagatga taatgcaaag ggtcatgatt 7440
tggOTttatt tttgttttat tttaaaattt atactgctac ttttgaaaga attgttttta 7500
tgactatgct ctttttgtga ttgaaaagtc atctaataga agctgtatag aagctacttt 7560
ttaattgctg gcaaacagct litaagtgcac tttctttgat tacacttcca ttttttgtta 7620
aacttgaatt ttctgaagcc ttttatgtac cactaagcaa ataactttaa cctttaaata 7680
aagcaaattt acatctttat tgtatttcta cttgttacaa aacatacttg ctaaagtaac 7740
ttcagtcctc aaatatagct gggaaacaat tatgagatag actcagtctc cccctcccac 7800
cccttttccc ctgccatatc taattaagca ctaactgatt ttattacttt attgccttta 7860
cactgcttat tctttttgac tgaattctgt ccctgattca ctgttttgtt tgaaatttaa 7920
agttattttc ttactgtatt tatcatacct gttttaatct gttttcttta aatgcaataa 7980
atcccaaatg gattgcatat tctttataat cagtg 8015



<210> 24 

<211> 7945 

<212> DNA

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7526214CB1 



<400> 24 

agccgggcag ctgcagcgga gccgcggagc gggcggcggg gcccaggctg tgcgcttggg 60
gagcgcggaa tgtgaggctt ggcgggccgc agcacgctcg gacgggccag gggcggcgac 120
ccctcgcgga cgcccggctg cgcgccgggc cggggacttg cccttgcacg ctccctgcgc 180
cctccagctc gccggcggga ccatgaagaa gttctctcgg atgcccaagt cggagggcgg 240
cagcggcggc ggagcggcgg gtggcggggc tggcggggcc ggggccgggg ccggctgcgg 300
ctccggcggc tcgtccgtgg gggtccgggt gttcgcggtc ggccgccacc aggtcaccct 360
ggaagagtcg ctggccgaag gtacgggcgc ccggggaggc tcggacaggc aggt^gattc 420
tccacagttt tcctcgtgcg tactcacggt ggaatccgat gtgcattgaa gcgaatgtat 480
gtcaataaca tgccagacct caatgtttgt aaaagggaaa ttacaattat gaaagagcta 540
tctggtcaca aaaatattgt gggctatttg gactgtgctg ttaattcaat tagtgataat 600
gtatgggaag tccttatctt aatggaatat tgtcgagctg gacaggtagt gaatcaaatg 660
aataagaagc tacagacggg ttttacagaa ccagaagtgt tacagatatt ctgtgatacc 720
tgtgaagctg ttgcaaggtt gcatcagtgt aagactccaa taattcaccg ggatctgaag 780
gtagaaaata tttgctgaat gatggtggga actatgtact ttgtgacttt ggcagtgcca 840
ctaakaaatt tcttaatcct caaaaagatg gagttaatgt agtagaagaa gaaattaaaa 900
agtatacaac tctgtcatac agagcccctg aaatgatcaa cctttatgga gggaaaccca 960
tcaccaccaa ggctgatatc tgggcactgg gatgtctact ctataaactt tgtttcttca 1020
ctcttccttt tggtgagagt caggttgcta tctgtgatgg caacttcacc atcccagaca 1080
attctcgtta ctcccgtaac atacattgct taataaggtt catgcttgaa ccagatccgg 1140
aacatagacc tgatatattt caagtgtcat attttgcatt taaatttgcc aaaaaggatt 1200
gtccagtctc caacatcaat aattcttcta ttccttcagc tcttcctgaa ccgatgactg 1260
ctagtgaagc agctgctagg aaaagccaaa taaaagccag aataacagat accattggac 1320
caacagaaac ctcaattgca ccaagacaaa gaccaaaggc caactctgct actactgcca 1380
ctcccagtgt gctgaccatt caaagttcag caacacctgt taaagtcctt gctcctggtg 1440
aattcggtaa ccatagacca aaaggggcac taagacctgg aaatggccct gaaattttat 1500
tgggtcaggg acctcctcag cagccgccac agcagcatag agtactccag caactacagc 1560
agggagattg gagattacag caactccatt tacagcatcg tcatcctcac cagcagcagc 1620
agcagcagca gcagcaacag caacagcagc agcagcaaca gcaacagcag cagcagcagc 1680
agcagcagca gcagcagcac caccaccacc accaccaccc tacttcaaga tgcttatatg 1740
cagcagtatc aacatggcaa cacagcagca acagatgctt caacaacaat ttttaatgca 1800
ttcggtatat caaccacaac cttctgcatc acagtatcct acaatgatgc cgcagtatca 1860
gcaggctttc tttcaacagc agatgctagc tcaacatcag ccgtctcaac aacaggcatc 1920
acctgaatat cttacctccc ctcaagagtt ctcaccagcc ttagtttcct acacttcatc 1980
acttccagct caggttggaa ccataatgga ctcctcctat agtgccaata ggtcagttgc 2040
tgataaagag gccattgcaa atttcacaaa tcagaagaac atcagcaatc cacctgatat 2100
gtcagggtgg aatccttttg gagaggataa tttctctaag ttaacagaag aggaactatt 2160
ggacagagaa tttgaccttc taagatcaag ttctcctgaa aagaaagctg aacattcatc 2220
tataaatcaa gaaaatggca ctgcaaaccc tatcaagaac ggtaaaacaa gtccagcatc 2280
taaagatcag cggactggaa agaaaacctc agtacagggt caagtgcaaa aggggaatga 2340
tgaatctgaa agtgattttg aatcagatcc cccttctcct aagagcagtg aagaggaaga 2400
gcaagatgat gaagaagttc ttcaggggga acaaggagat tttaagtgat gatgatactg 2460
aaccagaaaa tctgggtcat aggcctctcc tcatggattc tgaagatgag gaagaagagg 2520
agaaacatag ctctgattct gattatgagc aggctaaagg caaagtacag tgacatgagc 2580
tctgtctaca gagacggatc tggcagtggg accaacccaa gatcttaata caatactcct 2640
cacctcagcc caattatcct ctgatgttgc agtggagact cccaaacagg agtttgatgt 2700
atttggcgct gtccccttct ttgcagtgcg tgctcaacag ccccagce^g aaaagaatga 2760
aaagaacctc cctcaacaca ggtttcctgc tgcaggactg gagcaggagg aatttgatgt 2820
attcacaaag gcgcctttta gcaagaaggt gaatgtacaa gaatgccatg cagtggggcc 2880
tgaggcacat actatccctg gttatcccaa aagtgtagat gtatttggct ccactccatt 2940
tcagcccttc ctcacatcaa caagtaaaag tgaaagcaat gaggaccttt ttgggcttgt 3000
gccctttgat gaaataacgg ggagccagca gcaaaaagtc aaacagcgca gcttacagaa 3060
actgtcctct cgccaaaggc gcacaaagca ggatatgtcc aaaagtaatg ggaagcggca 3120
tcatggcacg ccaactagca caaagaagac tttgaagcct acctatcgca ctccagagag 3180
ggctcgcagg cacaaaaaag tgggccgccg agactctcaa agtagcaatg aatttttaac 3240
catctcagac tccaaggaga acattagtgt tgcactgact gatgggaaag atagggggaa 3300
tgtcttacaa cctgaggaga gcctgttgga ccccttcggt gccaagccct tccattctcc 3360
agacctgtca tggcaccctc cacatcaggg cctgagcgac atccgtgctg atcacaatac 3420
tgtcctgcca gggcggccaa gacaaaattc actacatggg tcattccata gtgcagatgt 3480
attgaaaatg gatgattttg gtgccgtgcc ctttacagaa cttgtggtgc aaagcatcac 3540
tccacatcag tcccaacagt cccaaccagt cgaattagac ccatttggtg ctgctccatt 3600
tccttctaaa cagtagatac ttctgatgga ttctcggcat taactcctgt ttcaaaaaag 3660
tgtgaacagt tttatgaatt tgaaagaaaa tttggtagct ctttatagca ttcattctta 3720
aagatcagtc agaataggtg atttctaaat aaaccaaata gaagaatgaa gtatctctac 3780
agggtagtaa ctcgattcct cttcaggaga aaagggagct aaattgcaag ctctaactaa 3840
gggtttctgc tactgacatc acaacacaga aatgcaagtg tggtacttcc agtgaaagca 3900
catggcacct ttctaggtgt gtagccactg agaagggaca gtgaaactgt tatttttgat 3960
atcagaatgt catttttatg tgcatatccc taaaattagg gttatttcta catacactag 4020
ttacacttgt gaattttttt taaggtctct tttaatttcc agacagttaa aaacaatcta 4080
gttatcttaa agcattagaa agttattatc tggagagtgc agagatttca gtccatacac 4140
ctttctccac aaagcagagc cagaagtaac tgactattgt gcctaaaact ctgtttcatt 4200
tttaaaaaca agtgccatta aaatggaata tctaatgata agcatatgaa ataatgtgta 4260
attagctcaa tttaactatt ccacaactta catattccaa aacaatgtta tacatgataa 4320
atatatataa tttttgtcag ttaaaacaaa ttaaaaaaat ggactatcgt cgcacagaag 4380
cctagaacaa aaatatgaag ageuiatatct gacatttgta Etagaaattat eidgaagaaaa 4440
aaagatacag aacagaaaac attcactact ttagaaacac tttatgcatg gcttcttgcc 4500 
ccaaactttt attgtgatgg ccctaataaa gcagattatt ggaaaaattg gaggacaagg 4560 
gttgtataaa aattttattt tatgaagaaa atatgtagcg gaaactgaat tttcaagaca 4620 
tttacaatgt gaaatcatgt tgcatttaac aatgtacttt attagcaact tcaccaaata 4680 
ttccccaagt cataagcaac aattattttt attaggtttt ggggggtgga gtagttttaa 4740 
taaagtgcac agaatggtga cacccacaaa gccttatata aaggcaggat tcatgcatcc 4800 
tgctgcaagt acctctgcac taatatacca gatcctaaaa tgcatataag gtggactagc 4860 
atcttaattc tgctagttga ttgtgtcttt actgaaaaga acccagctac caatttgcct 4920 
ttttttacac cacaaatcct aattagaaac ttgaggtttt atagaaatca tttaatgata 4980 
gagattacat atgtgaatta atgtgaatat agtatctgtg cttcctgtgt ctatgactat 5040 
tttaagatat aattgtgctg cgctatcaga ttaacatttg gaagtttcta gaacagttaa 5100 
tgctatttac agaaaggagt agaaactcat caactggcac tctctttgat ttttatattt 5160 
taaattaact ctcttcgatc tcaaagtata ttttacgagt aattttatta ggaatctctt 5220
atagtgcccc aatgggataa gctattgcct attttcacag ttctgaactt ggaaagaagc 5280
aaagtatatg taactaaagc acatattgtc tttttattgc tttttccctt cttttatatg 5340
ctaaatcaat atagatttgt ggatagggaa gcaatatgtg aatcacaatg tagcagaggc 5400
agaccaagca ttacattatt atttagagct ggactgcacc aattacttgt cctcgtgcca 5460
aaggcaaata tgtttgcacc tttttttttt tttctgattc tcaggttgat taatactgct 5520
aatgcaaatg ctcaagtaga tgtttaaaaa ctttacaaaa tagattcaag tgatactttc 5580
ttttaaaagt gaagagttga tgattacaca tagtaaattc atgaactaca gtaggtttgt 5640
atcaaacaat tttttttaat gaaaatctgt tgaggtgtac acaatatgct tcttgattgt 5700 
attagtcctt ggtctctgct agacctcatg agtttcatca tttagaaaag gggtagagga 5760 
tgaactaatg tctccttcag atgtaaacat gaaataccta gagttttact tgcttttcaa 5820 
tacactgaat aattttaaat gattctgaca ctgatgtaga ccctttgact tataaattct 5880 
gaggaaacaa ctgacagcat aaaatattta cattcttata acacagcaca gtgactttct 5940 
tctttcaaga ttgtagctca gagaaaagat acaggattca attgggggtt caataggata 6000 
gaaatggaga gattcctttg tgttgtagta gaggcatttt cctaaggagt atagatttat 6060 
actttgcatt ttcattcatc atcccccaga atcatggtca aggtgtaggt cactccacac 6120 
agctgatgct caggttattc ccttgtgaga attatgagaa taaagctccc aagatatgtg 6180
aaagtgctta acacagtacc tggcacacag cactcaataa aagtttggct ctattatggg 6240 
atggttcaat tctggtttaa ggaaggaaga aaggttatta tatatgtacc actaagcaaa 6300 
tatatatata tatatatatt tggggttttt tttccctaat attatttggg tgtcccctgt 6360 
gcttctttag gatgtagtta taactaaacc tgttatactt gaacatcact aagagaagta 6420
aattattatg aagctagcaa aaatcttgag gccaaagttg tttcttaaca gctttaataa 6480
tgcttgttga ttttgaataa tcctttaaaa agtggaccat ttgcttattt taatatcacg 6540 
tcagtaaaat gttagtatta aaaagatcag ctttttatgg cattgaagaa tgtatctgct 6600 
aagacacaaa aattgcatgg taagtataat aggtggagga ggaaaggttg taggccggat 6660 
gaaaatttaa ctgactagaa catttattca ggagtgtaat tattttccct taccccaatc 6720 
cctgtgtacg tgttgggtat agttacgaca ttatccggat ttgcaaatag acacaacttt 6780 
cagtcttacc ctgtttattg tttaagagtg atagactgtt gtcctcttgc tggtggaaaa 6840 
tctaaggtgg agcccactct tctatgctga agttcaccag gcagagcagt tttcttacaa 6900 
gtcagctact ctgcttggtt tattttaggt tttggtactt cacgtaagca ctgttagaag 6960 
gtacaagtgt attaatatca ctagttttga ggcgcttggg tacatttgtt tttaatatat 7020
ttagaatgtg cagtaaactt ttttctcatt tttttttctt tttagcaaac ttgttatttt 7080 
aggtccaatt attgagttga cagtctactg tgagaatgag atgacatatc tactgtgaga 7140 
ataccataaa tgatgaatag tttatttgag aacttttata ctcagtggtg ttttatatat 7200 
taagataaaa atatgtacac acatgcatgt cacatctctc tactgtggag ttaatgtgaa 7260
tttttaaaaa atggaattgc aaccacaatc atatctaaga gaacattcac tcctagtgag 7320 
ggtttacaaa agctactaag agaataaggt agatgataat gcaaagggtc atgatttggg 7380 
gttatttttg ttttatttta aaatttatac tgctactttt gaaagaattg tttttatgac 7440 
tatgctcttt ttgtgattga aaagtcatct aatagaagct gtatagaagc tactttttaa 7500
ttgctggcaa acagctttaa gtgcactttc tttgattaca cttccatttt ttgttaaact 7560 
tgaattttct gaagcctttt atgtaccact aagcaaataa ctttaacctt taaataaagc 7620
aaatttacat ctttattgta tttctacttg ttacaaaaca tacttgctaa agtaacttca 7680
gtcctcaaat atagctggga aacaattatg agatagactc agtctccccc tcccacccct 7740
tttcccctgc catatctaat taagcactaa ctgattttat tactttattg cctttacact 7800
gcttattctt tttgactgaa ttctgtccct gattcactgt tttgtttgaa atttaaagtt 7860
attttcttac tgtatttatc atacctgttt taatctgttt tctttaaatg caataaatcc 7920
caaatggatt gcatattctt tataa 7945

<210> 25 

<211> 3149 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526228CB1 

<400> 25 

ctcgcggtat catccggtgc tgaggccctg taataaaggt ctcgcgaaat ttgttctaga 60
ggtccaagtt gcttcttagc ttactccacc ccacccccaa cctgtccctc cttttctttc 120
caagtcacaa aattctcccc tcccctaccc cggagtttac ggccctcctc ctgtttccga 180
tttcagcccg gaaccggaag tgaagtgggc ggggcccgtc ggcggaaaac gcagcggagc 240
cagagccgga cacggctgtg gccgctgcct ctacccccgc cacggatcgc cgggtagtag 300
gactgcgcgg ctccaggctg agggtcggtc cggaggcggg tgggcgcggg tctcacccgg 360
attgtccggg tggcaccgtt cccggcccca ccgggcgccg cgagggatca tgtctacagc 420
ctctgccgcc tcctcctcct cctcgtcttc ggccggtgag atgatcgaag ccccttccca 480
ggtcctcaac tttgaagaga tcgactacaa ggagatcgag gtggaagagg ttgttggaag 540
aggagccttt ggagttgttt gcaaagctaa gtggagagca aaagatgttg ctattaaaca 600
aatagaaagt gaatctgaga ggaaagcgtt tattgtagag cttcggcagt tatcccgtgt 660
gaaccatcct aatattgtaa agctttatgg agcctgcttg aatccagtgt gtcttgtgat 720
ggaatatgct gaagggggct ctttatataa tgtttgtgcc tttctttcgc agtgctgcat 780
ggtgctgaac cattgccata ttatactgct gcccacgcaa tgagttggtg tttacagtgt 840
tcccaaggag tggcttatct tcacagcatg caacccaaag cgctaattca cagggacctg 900
aaaccaccaa acttactgct ggttgcaggg gggacagttc taaaaatttg tgattttggt 960
acagcctgtg acattcagac acacatgacc aataacaagg ggagtgctgc ttggatggca 1020
cctgaagttt ttgaaggtag taattacagt gaaaaatgtg acgtcttcag ctggggtatt 1080
attctttggg aagtgataac gcgtcggaaa cccttttgat gagattggtg gcccagcttt 1140
ccgaatcatg tgggctgttc ataatggtac tcgaccacca ctgataaaaa atttacctaa 1200
gcccattgag agcctgatga ctcgttgttg gtctaaagat ccttcccagc gcccttcaat 1260
ggaggaaatt gtgaaaataa tgactcactt gatgcggaga aatatttgct gttttcttca 1320
agactatcta gactccaatg tcagctagat tcttagtact ttccaggagc agatgagcca 1380
ttacagtatc cttgtcagta ttcagatgaa ggacagagca actctgccac cagtacaggc 1440
tcattcatgg acattgcttc tacaaatacg agtaacaaaa gtgacactaa tatggagcaa 1500
gttcctgcca caaatgatac tattaagcgc ttagaatcaa aattgttgaa aaatcaggca 1560
aagcaacaga gtgaatctgg acgtttaagc ttggggagcc tcccgtggga gcagtgtgga 1620
gagcttgccc ccaacctctg agggcaagag gatgagtgct gacatgtctg aaatagaagc 1680
taggatcgcc gcaaccacag gcaacggaca gccaagacgt agatccatcc aagacttgac 1740
tgtaactgga acagaacctg gtcaggtgag cagtaggtca tccagtccca gtgtcagaat 1800
gattactacc tcaggaccaa cctcagaaaa gccaactcga agtcatccat ggacccctga 1860
tgattccaca gataccaatg gatcagataa ctccatccca atggcttatc ttacactgga 1920
tcaccaacta cagcctctag caccgtgccc aaactccaaa gaatctatgg cagtgtttga 1980
acagcattgt aaaatggcac aagaatatat gaaagttcaa acagaaattg cattgttatt 2040
acagagaaag caagaactag ttgcagaact ggaccaggat gaaaaggacc agcaaaatac 2100
atctcgcctg gtacaggaac ataaaaagct tttagatgaa aacaaaagcc tttctactta 2160
ctaccagcaa tgcaaaaaac aactagaggt catcagaagt cagcagcaga aacgacaagg 2220
cacttcatga ttctctggga ccgttacatt ttgaaatatg caaagaaaga ctttttttta 2280
aggaaaggaa aaccttataa tgacgattca tgagtgttag ctttttggcg tgttctgaat 2340
gccaactgcc tatatttgct gcattttttt cattgtttat tttccttttc tcatggtgga 2400
catacaattt tactgtttca ttgcataaca tggtagcatc tgtgacttga atgagcagca 2460
ctttgcaact tcaaaacaga tgcagtgaac tgtggctgta tatgcatgct cattgtgtga 2520
aggctagcct aacagaacag gaggtatcaa actagctgct atgtgcaaac agcgtccatt 2580
ttttcatatt agaggtggaa cctcaagaat gactttattc ttgtatctca tctcaaaata 2640
ttaataattt ttttcccaaa agatggtata taccaagtta aagacagggt attataaatt 2700
tagagtgatt ggtggtatat tacggaaata cggaaccttt agggatagtt ccgtgtaagg 2760
gctttgatgc cagcatcctk ggatcagtac tgaactcagt tccatccgta aaatatgtaa 2820
agataagcaa gatctaagaa gttatcaaaa ctattcttta aaatgctaaa gcagctcctg 2880
tagccagaga tcacaggtct tccctgtgaa actttggttt ctttctataa atgtgtgtgg 2940 
ttttcagcgc tcaactcctg tcttcaaatg gtagtaagtt ctacttctac ttctgtcatt 3000 
cagaacattt tatgtcaaat gatgtaatgc agaaattctt gtgcatattt gtaactgaag 3060
gaagcttttt agatttattt ttgtttttaa taaaattcag attcctattc taaactggta 3120 
cataaaagtg gtgaatgact tgtatcagc 3149 

<210> 26 
<211> 3617 
<212> DNA

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7526246CB1 

<400> 26 

taagatggcg gacctggagg cggtgctggc cgacgtgagc tacctgatgg ccatggagaa 60 
gagcaaggcc acgccggccg cgcgcgccag caagaagata ctgctgcccg agcccagcat 120 
ccgcagtgtc atgcagaagt acctggagga ccggggcgag gtgacctttg agaagatctt 180 
ttcccagaag ctggggtacc tgctcttccg agacttctgc ctgaaccacc tggaggaggc 240 
caggcccttg gtggaattct atgaggagat caagaagtac gagaagctgg agacggagga 300 
ggagcgtgtg gcccgcagcc gggagatctt cgactcatac atcatgaagg agctgctggc 360
ctgctcgcat cccttctcga agagtgccac tgagcatgtc caaggccacc tggggaagaa 420 
gcaggtgcct ccggatctct tccagccata catcgaagag atttgtcaaa acctccgagg 480 
ggacgtgttc cagaaattca ttgagagcga taagttcaca cggttttgcc agtggaagaa 540 
tgtggagctc aacatccacg tgagtgggct tgggtggggc atggaaagcc acgcaccctg 600 
ctgctcctct cccgggagct gggcctgtgg cttggctggg agggggaggt caggggatgt 660
ctgtccttta gcccccaggg ccgtggctat gggggtcagg gccgggatcc cagcatgggg 720 
aggccggagc aggtaaatat gtggcaagga tggccaggac atgggtatgg ggaccctggc 780 
atggggccag cccctgctgc ccaggtgcct ctgccccagg gctgggcaga ggcagcctgt 840 
ggtgaccgca gctgtcgctg cccctcagct gaccatgaat gacttcagcg tgcatcgcat 900 
cattgggcgc gggggctttg gcgaggtcta tgggtgccgg aaggctgaca caggcaagat 960 
gtacgccatg aagtgcctgg acaaaaagcg catcaagatg aagcaggggg agaccctggc 1020 
cctgaacgag cgcatcatgc tctcgctcgt cagcactggg gactgcccat tcattgtctg 1080
catgtcatac gcgttccaca cgccagacaa gctcagcttc atcctggacc tcatgaacgg 1140
tggggacctg cactaccacc tctcccagca cggggtcttc tcagaggctg acatgcgctt 1200 
ctatgcggcc gagatcatcc tgggcctgga gcacatgcac aaccgcttcg tggtctaccg 1260 
ggacctgaag gggcacccac gggtacatgg ctccggaggt cctgcagaag ggcgtggcct 1320 
acgacagcag tgccgactgg ttctctctgg ggtgcatgct cttcaagttg ctgcgggggc 1380 
acagcccctt ccggcagcac aagaccaaag acaagcatga gatcgaccgc atgacgctga 1440 
cgatggccgt ggagctgccc gactccttct cccctgaact acgctccctg ctggaggggt 1500
tgctgcagag ggatgtcaac cggagattgg gctgcctggg ccgaggggct caggaggtga 1560 
aagagagccc ctttttccgc tccctggact ggcagatggt cttcttgcag aagtaccctc 1620 
ccccgctgat ccccccacga ggggaggtga acgcggccga cgccttcgac attggctcct 1680 
tcgatgagga ggacacaaaa ggaatcaagt tactggacag tgatcaggag ctctaccgca 1740 
acttccccct caccatctcg gagcggtggc agcaggaggt ggcagagact gtcttcgaca 1800
ccatcaacgc tgagacagac cggctggagg ctcgcaagaa agccaagaac aagcagctgg 1860
gccatgagga agactacgcc ctgggcaagg actgcatcat gcatggctac atgtccaaga 1920 
tgggcaaccc cttcctgacc cagtggcagc ggcggtactt ctacctgttc cccaaccgcc 1980 
tcgagtggcg gggcgagggc gaggccccgc agagcctgct gaccatggag gagatccagt 2040 
cggtggagga gacgcagatc aaggagcgca agtgcctgct cctcaagatc cgcggtggga 2100 
aacagttcat tttgcagtgc gatagcgacc ctgagctggt gcagtggaag aaggagctgc 2160
gcgacgccta ccgcgaggcc cagcagctgg tgcagcgggt gcccaagatg aagaacaagc 2220 
cgcgctcgcc cgtggtggag ctgagcaagg tgccgctggt ccagcgcggc agtgccaacg 2280 
gcctctgacc cgcccacccg ccttttataa acctctaatt tattttgtcg aatttttatt 2340 
atttgttttc ccgccaagcg gaaaaggttt tattttgtaa ttattgtgat ttcccgtggc 2400 
cccagcctgg cccagctccc ccgggagggg cccgcttgcc tcggctcctg ctgcaccaac 2460 
ccagccgctg cccggcgccc tctgtcctga cttcaggggc tgcccgctcc cagtgtcttc 2520 
ctgtggggga agagcacagc cctcccgccc cttccccgag ggatgatgcc acaccaagct 2580
gtgccaccct gggctctgtg ggctgcactc tgtgcccatg ggcactgctg ggtggcccat 2640
cccccctcac cagggcaggc acagcacagg gatccgactt gaattttccc actgcacccc 2700
ctcctgctgc agaggggcag gccctgcact gtcctgctcc acagtgttgg cgagaggagg 2760
ggcccgttgt ctccctggcc ctcaaggcct cccacagtga ctcgggctcc tgtgccctta 2820
ttcaggaaaa gcctctgtgt cactggctgc ctccactccc acttccctga cactgcgggg 2880
cttggctgag agagtggcat tggcagcagg tgctgctacc ctccctgctg tcccctcttg 2940
ccccaacccc cagcacccgg gctcagggac cacagcaagg cacctgcagg ttgggccata 3000
ctggcctcgc ctggcctgag gtctcgctga tgctgggctg ggtgccgccg cctcgcccac 3060
cgcatgcccc ctcgtgccag tcgcgctgcc tgtgtggtgt cgcgccttct cccccccggg 3120
gctgggttgg cgcaccctcc cctcccgtct actcattccc cggggcgttt ctttgccgat 3180
ttttgaatgt gattttaaag agtgaaaaat gagactatgc gtttttataa aaaatggtgc 3240
ctgattcggc tgtctcagac tctttttgta cctggtgacc ccttttcagc ttctgctggg 3300
ctggggcctg atggggaggg tctcggtggt accaggtctc ctccaccgcc atggcttcca 3360
aggtggtctg ctcgggccca ggccatcttc caggtggggt gaggcagtgg gtcccacttc 3420
ccctcc^acc cctcccagct gacagtcctc tccacctagt ggctgtccag tgcccattcc 3480
tcaccttttc ccggggagga gagagcagct tctgccactt cccaggtaag caggaggagg 3540
tgccaacagt gttaggcctg gcacagtgtc tgggctgact gggaccgtct caggcccaca 3600
gaacacccct gcacagc 3617

<210> 27

<211> 1955

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526258CB1

<400> 27

agtgcctgcc gggagccacg tctccgaaga ccgatagctg cttcgggatt ggcgtccggg 60
cggctatcta ggggctgctg ggaagatggc ggactcggtg gctagccgat gaggaggccg 120
<^€r€r?Sr90&&c ccggcccccg ggccccgaga ccgactgagg gagcgacctg cgcagggccc 180
ggggagtcat ggtctccatc acccaactcc atgcttcgag tcctgctctc tgctcagacc 240
tcccctgctc ggctgtctgg cctgctgctg atccctccag tacagccctg ctgtttgggg 300
cccagcaaat ggggggaccg gcctgttgga ggaggcccca gtgcaggtcc tgtgcaagga 360
ctgcagcggc ttctggaaca ggcgaagagc cctggggagc tgctgcgctg gctgggccag 420
aaccccagca aggtgcgcgc ccaccactac tcggtggcgc ttcgtcgtct gggccagctc 480
ttggggtctc ggccacggcc ccctcctgtg gagcaggtca cactgcagga cttgagtcag 540
ctcatcatcc gtaactgccc ctcctttgac attcacacca tccacgtgtg tctgcacctt 600
gcagtcttac ttggctttcc atctgatggt cccctggtgt gtgccctgga acaggagcga 660
aggctccgcc tccctccgaa gccacctccc cctttgcagc cccttctccg agaggcaagg 720
ccagaggaac tgactcccca cgtgatggtg ctcctggccc agcacctggc ccggcaccgg 780
ttgcgggagc cccagcttct ggaagccatt acccacttcc tggtggttca ggaaacgcaa 840
ctcagcagca aggtggtaca gaagttggtc ctgccctttg ggcgactgaa ctacctgccc 900
ctggaacagc agtttatgcc ctgccttgag aggatcctgg ctcgggaagc aggggtggca 960
cccctggcta cagtcaacat cttgatgtca ctgtgccaac tgcggtgcct gcccttcaga 1020
gccctgcact ttgttttttc ccctggcttc atcaactaca tcagtggcac ccctcatgct 1080
ctgattgtgc gtcgctacct ctccctgctg gacacggccg tggagctgga gctcccagga 1140
taccggggtc cccgccttcc ccgaaggcag caagtgccca tctttcccca gcctctcatc 1200
accgaccgtg cccgctgcaa gtacagtcac aaggacatag tagctgaggg gttgcgccag 1260
ctgctggggg aggagaaata ccgccaggac ctgactgtgc ctccaggcta ctgcacaggt 1320
gagcaagggg caggcggcag gcccggggag acggagccct ggctaaggcc gccggccctg 1380
ctcccctcca gacttcctgc tgtgcgccag cagctctggt gctgtgcttc ccgtgaggac 1440
ccaggacccc ttcctgccat acccaccaag gtcctgccca cagggccagg ctgcctctag 1500
cgccactact cgagaccctg cccagaggta aaggaggcag ggtgggggag ccctggccac 1560
cttgcccgcc atgccctgag ccacgtccct ccccctgcag ggtggtgctg gtgttgcggg 1620
aacgctggca tttctgccgg gacggccggg tgctgctggg ctcgagggcc ctgagggagc 1680
ggcacctagg cctgatgggc taccagctcc tgccgctacc cttcgaggaa ctggagtccc 1740
agagaggcct gccccagctc aagagctacc tgaggcagaa gctccaggcc ctgggcctgc 1800
gctgggggcc tgaagggggc tgaggggttg atgtggggtt caggatggcc cccccatggg 1860 
gggtggatga tttgcacttt ggttccctgt gttttgattt ctcattaaag ttcctttcct 1920 
tccccgttgt gaatctcagt tttgggacgg ggagc 1955 

<210> 28 

<211> 2937 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7526311CB1 

<400> 28 

gcgcaggggg ccgggctccg gctaggaggg tgggggccgc gccggtgaca gccgatcccc 60 
gcccctgctg cccgccacgt ccctcacgta ccactcggca gaggcgcggg gaaacctggc 120
gtactggctg tggcttctct agcgggactc ggcatgaggc tggcgcggct gcttcgcgga 180 
gccgccttgg ccggcccggg cccggggctg cgcgccgccg gcttcagccg cagcttcagc 240 
tcggactcgg gctccagccc ggcgtccgag cgcggcgttc cgggccaggt ggacttctac 300 
gcgcgcttct cgccgtcccc gctctccatg aagcagttcc tggacttcgg atcagtgaat 360 
gcttgtgaaa agacctcatt tatgtttctg cggcaagagt tgcctgtcag actggcaaat 420 
ataatgaaag aaataagtct ccttccagat aatcttctca ggacaccatc cgttcaattg 480 
gtacaaagct ggtatatcca gagtcttcag gagcttcttg attttaagga caaaagtgct 540 
gaggatgcta aagctattta tgaaaggcct agaagaacat ggttgcaggt ctctagttta 600 
tgctgtatgg cctgcaagat gatctttatt gtttggtgga aaaggcaaag gaagtccatc 660 
tcatcgaaeia cacattggaa gcataaatcc aaactgcaat gtacttgaag ttattaaaga 720 
tggctatgaa aatgctaggc gtctgtgtga tttgtattat attaactctc ccgaactaga 780
acttgaagaa ctaaatgcaa aatcaccagg acagccaata caagtggttt atgtaccatc 840
ccatctctat cacatggtgt ttgaactttt caagaatgca atgagagcca ctatggaaca 900
ccatgccaac agaggtgttt acccccctat tcaagttcat gtcacgctgg gtaatgagga 960 
tttgactgtg aagatgagtg accgaggagg tggcgttcct ttgaggaaaa ttgacagact 1020 
tttcaactac atgtattcaa ctgcaccaag acctcgtgtt gagacctccc gcgcagtgcc 1080 
tctggctggt tttggttatg gattgcccat atcacgtctt tacgcacaat acttccaagg 1140 
agacctgaag ctgtattccc tagagggtta cgggacagat gcagttatct acattaaggc 1200
tctgtcaaca gactcaatag aaagactccc agtgtataac aaagctgtct ggaagcatta 1260 
caacaccaac cacgaggctg atgactggtg cgtccccagc agagaaccca aagacatgac 1320 
gacgttccgc agtgcctaga cacacttggg acatcggaaa atccaaatgt ggcttttgta 1380
ttaaatttgg aagtgtggcc cagagttgct cagaattgga gcagagcctg agacgtatct 1440 
gcagatcctg tcatcagctg gcaagtccag gagactgtgt catttagaga ctgtgttgtt 1500 
agttatccct caacatcttc taaggtggca ggaaataata ttggaaataa cattttaaag 1560 
taaaaatttt aaagtttaaa gaagagtttt gccacttaaa caggggagct ttgtctggaa 1620
aatacactga gttgaaacac ttcatccttg gaaggattat ataagatgaa cagttgtgat 1680 
aaatgtgtag attagaggga tgtgaatggg cagttagtcc agtgccctca tttaagaggc 1740 
caagatcctg attcagagga ggcatccttt gcccagagct gcttagctaa tctgaccaaa 1800
tgttgggaaa aatgtctcac ctaacccact attccttaat tatggatttt gtgaaaaaca 1860
atagaacatg ttaatgagta atttatatta gttcgatgta ttacaatttt ttagctttaa 1920
attacagttt tcttataatg ttgaaatgtt ttagaatcct ttgaatctaa gtatttgttt 1980
cctaaatgaa acatttgtac aacatttgat gtttttactt atgaaatatt ctcctccccc 2040
aagaaaattt aaactttttc tctctattta aaagctaaga aatgttttaa aggaaaaatg 2100
aaattatctt cctttagctt atttttaagg taaaacagct ttttactctg ttattgtggt 2160
aatggacaga atattacata caaaaatatt ctgggagagc tttttcctag ttggttttaa 2220
atcattgtgc cacctgaaag gtttttagat tttataggag ctaatttgtc caccagcatt 2280
aatgtaacac agtgtagtta tgaaaatata ttgaaggaca ggaagtggac acgaagtgat 2340
ttttgtaacc tgagcagtta atgaatgtgc caacattttc taggaaggga cagcaagaat 2400
attctgctct gtagttaaaa tactggctgg cttttgatgt cttcatgctt aattgtgatc 2460
actttcttgc actgtgatgt ttttacgtga atatgttgaa gtagaagtct accatattat 2520
tttataaaat gttttctgta tggcaataaa ctgaaaacat ggatcaaccc ttcttttgaa 2580
aataaactga gtcaatttag ccttttaaaa atatagtcat ctcttttaaa tagaatcctc 2640
ttccaccatc aaggctcaac attttgtaag catccaaaaa attggtaatt agggggcttg 2700
cactaaattt cactatcttc agtagagagg aactgtttgg aacttagatt tccaatgtgt 2760 
atattctaat ggagaaagca agaggtagag tttgtatgtt tgacttacct tagattttta 2820
ttttccatac atactgcaaa tgattgactt gttgcataaa tgaagatctt ctgttgtgtg 2880
cttttcaaac actgtaaata aatttgaaat ttgaataact ttccacagta taactgt 2937 

<210> 29 
<211> 6122 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7526315CB1 

<400> 29 

ctgagaggag acctggtgac acagtagcct ttcggcagag ctccttggga tgagtaggaa 60
gtgctgctgc aggttttgtc tgggggatat ctgagccatt tctctgtggg cagctgtgtt 120 
tcaaagtctg ggcaggttgt tgttgaattt tgcgtgggct gccaggattt tgtggaagta 180 
taatactttg tcattatgag atgtcgtctc tcggtgcctc ctttgtgcaa attaaatttg 240
atgacttgca gttttttgaa aactgcggtg gaggaagttt tgggagtgtt tatcgagcca 300 
aatggatatc acaggacaag gaggtggctg taaagaagct cctcaaaata gagaaagagg 360 
cagaaatact cagtgtcctc agtcacagaa acatcatcca gttttatgga gtaattcttg 420
aacctcccaa ctatggcatt gtcacagaat atgcttctct gggatcactc tatgattaca 480 
ttaacagtaa cagaagtgag gagatggata tggatcacat tatgacctgg gccactgatg 540 
tagccaaagg aatgcattat ttacatatgg aggctcctgt caaggtgatt cacagagacc 600 
tcaagtcaag aaacgttgtt atagctgctg atggagtatt gaagatctgt gactttggtg 660 
cctctcggct ccataaccat acaacacaca tgtccttggt tggaactttc ccatggatgg 720 
ctccagaagt tatccagagt ctccctgtgt cagaaacttg tgacacatat tcctatggtg 780
tggttctctg ggagatgcta acaagggagg tcccctttaa aggtttggaa ggattacaag 840 
tagcttggct tgtagtggaa aaaaacgaga ggctaaagaa actagagcgt gatctcagct 900 
ttaaggagca ggagcttaaa gaacgagaaa gacgtttaaa gatgtgggag caaaagctga 960
cagagcagtc caacaccccg cttctcttgc ctcttgttgc aagaatgtct gaggagtctt 1020
actttgaatc taaaacagag gagtcaaaca gtgcagagat gtcatgtcag atcacagcaa 1080 
caagtaacgg ggagggccat ggcatgaacc caagtctgca ggccatgatg ctgatgggct 1140 
ttggggatat cttctcaatg aacaaagcag gagctgtgat gcattctggg atgcagataa 1200
acatgcaagc caagcagaat tcttccaaaa ccacatctaa gagaaggggg aagaaagtca 1260
acatggctct ggggttcagt gattttgact tgtcagaagg tgacgatgat gatgatgatg 1320 
acggtgagga ggaggataat gacatggata atagtgaatg aaagcagaaa gcaaagtaat 1380 
aaaatcacaa atgtttggaa aacacaaaag taacttgttt atctcagtct gtacaaaaac 1440
agtaaggagg cagaaagcca agcactgcat ttttaggcca atcacattta catgaccgta 1500
atttcttatc aattctactt ttatttttgc ttacagaaaa acggggggag aattaagcca 1560
aagaagtaca tttatgaatc agcaaatgtg gtgcctgatt atagaaattt gtgatcctat 1620
atacaatata ggacttttaa agttgtgaca ttctggcttt ttcttttaat gaatactttt 1680
tagtttgtat ttgactttat ttcctttatt caaatcattt ttaaaaactt acattttgaa 1740
caaacactct taactcctaa ttgttctttg acacgtagta attctgtgac atactttttt 1800
tttcttatag caatacactg taatatcaga aatggttggc ctgagcaacc tagtaagacc 1860
tcgtctctac taataattaa aaaactagct ggcatggtag cacacacctg tagtcccaga 1920
tacttgggag gccaaggcag gaggattgct tgagacctag caatcagtca gggctgcagt 1980
gagccatgat ggcaccactg cactctagcc tgggcaagag aacaagatcc tgtctcaaaa 2040
aacaaaaaaa agaaagetatt gatagtacaa aatccaacaa caatactgag atgatctaag 2100
aaggttataa caaaatgctc ttcagaaata cctaagtgct gagaattttt agtactaaag 2160
agcacagctg ctcaaagtaa agcctgagca gtgttctcag taatgtattt gaaggaaaaa 2220
taccctgatt tgaaacccac agcagatgtt gcaaactttc ataccactgc tggccatgga 2280
agcctcttaa caacacactg tcatttaagg ctgtgcttgt gctttataca aagagaaaga 2340
ggtggtctta aggggatgct tccagggggt gagttcatgc ctctcctgta ttttccagca 2400
agtggggtat gtgtggtggt tgttttttag aggggcataa taatccagga ttctaagcat 2460
atgctcagct attttaaaga ggaaattaaa tattataaaa gaaatagtaa agataagtta 2520
tcctcactta ggcaaaagca caggtccttt ccatatcaag tttagcctac cagggttgtt 2580
ttttgtttta accctgctta ataatgttgg tgttttagaa gtagatacag gcactgctct 2640
gaaaacctgg ctagccatgg atattctcag aatgttatca cctgtttgtc aaagcttgtt 2700
taaattataa ciacactttta attatatata tgaggcaaaa gaactaagac ttttttcaaa 2760 
ctaaattaga aaggagtgtc attatttgac tgttaaacca aaatattttt ggtgggtctt 2820 
tttatggaag tttaaagaaa ggacatcatc atagatatga tctaacagta tttctaacta 2880 
tatttgatca ttaaaagcct cttggaattt gaagcgtgac gtgtttctaa tgccccttga 2940
gaggtgaaaa ataccacata atgatcagta tgctgtgcca gcttcatttg gggagaaata 3000 
actagtagaa agttctgggt gtgaggtgta cagcagtcta ggtggcatag tgatgaagaa 3060 
agggatcaga gtctgactgt cactcagaat cctgggctca gttgcttgac aaccttggga 3120 
aaattgtttt atctttgtgc gtctgtttgc tgatcttcag cgtgggaata ataacagtac 3180
ctacttgaaa ggatcattgt gcggattaaa agaaataata tatgtaaagc actttaacac 3240 
agcaccaggc ccacggaaag tggctaatgt tagctactat gaatggtgcc agtgaagaca 3300 
ctgaaaaata agtgatttca gtaaccttct ggaaagctat cagtttcaaa tetatattttc 3360 
tcbgtagtat gagatgeiaat taaaagtgga tagctttcag gctaagateiaa gagaacatgc 3420 
ttagaatgta agctaaacag attttttctg ttgctctttg aaaactatga gccctggcca 3480 
gcttaacctg gtctgaggtg agactaaaca caaaaacagt agataaatct ctccctaaaa 3540
gatggattcc cccacatacc catgctacta gtttctctgt ctattcacac atatgtacaa 3600 
atacatgaac acagcctgtc tgtgctcaga catagagaag tactacctga cttgagtcaa 3660
tgcacccaag aagaaaagct tggagtagag cagaagggag ggcttgggac tcctgtcttt 3720 
ccagcatgcc ctggggtgca gtggtcagcc acctgaagag agagccaata gccatggggt 3780 
ttacaaggca aagatagtca ttcattcaaa cacatattca tagaagctcc ttctctgtgc 3840 
cagacaactg ttctggaaga tagctagatg aaaatctttg cactcacagg agcttaacat 3900 
gccagtgagt gaagatcgat gataaataaa gcaaatgcat catatgttca catttgataa 3960 
gtatatgcca aaaaatgaag ccgggaagga ggacaaggcc catgggtggg tgttgaggtt 4020 
tttaaagtgt ggtcaggaaa ggccccactg ataaggtaac atttgagcaa gtctgaaaaa 4080 
ggcaagggga tctttggggc taacttcggg atccctgcac tttatgtaag aatgtaaacc 4140 
tggagtctca tttaagaatg atcagcaata cgtttagaac atatgaactg aatgaaatgg 4200 
acattttttc ttaatttatg tataaatcca tatgattata cataaagttc tgatgcatta 4260
ataaaagcag ccaaataggg ccaaagagaa aaataacagg actctgtact ggacctaact 4320
ttatcattaa ttaggtaata ttttcctcat ttctttactg ctgccatttt cctcaccagt 4380
attccagaga tggtcatagc tcattactct accaccaaga acctaaaagg aattagaata 4440
cagcagaatt ggcctcagtg aagagcttaa aattgttctc ctcgtagaac tggactattg 4500
atcattacca cgtgacgttg gctctattac tttctgttcc caatgtcctt ctagtggttt 4560 
gaaaatgtta aaacatccaa aaaaaaacaa cccggtagca ttgtcccttc cccactgaca 4620
aacttatcaa atccagaagc tttagagttt cgtctctaat tatttttctc ctgaacaaaa 4680
ttacccaagt caaaacaaaa tgtattttta gaattacggc agcatacgac ctgaattttg 4740 
tgagtttcgt ggctttatct taaatcacca tttccctaaa aatggtttct ttctccttag 4800 
aaatgctggt ggcaacttga tgaaacagcc aaatgcacca gggcaggtca ctttcccatt 4860
acactgattc cacaattaaa aaaataaaaa aaagaaaaaa aactcattga gatagctaca 4920 
gttctatagg ttaatttaaa gcctcctttt tctactcatt tttgaaagca aaattacatt 4980 
ttactatttt acataaccag tgaaaagacg ttgaaagcct acagctcact gttttgggtg 5040 
ctctggaaat gttgagggtg ggtttttaac cagtgatttt taacgtgcag tgaatttgtt 5100 
agacttttaa acaccagcta aggtagtcaa acttgatccc cattaaaaat caaggaatta 5160 
ggggtcgggg gagggtttag gagtgatcca gaatgacctc ccagaattac tgtgcgtaca 5220 
actttatttt tcagagtttt cattgggaat ggtaagagtg tttatgaaag acagttttaa 5280 
aacttattct gagttaaata ttaatacttt aaaaaattat tgtactagac ttatcgcagc 5340 
cttttgaaag tagcagagtt tcatcatacc acatatataa cagagcataa attttctata 5400 
atcaggcacc ttttgctgct tttgagtaag actgttttcc tgtttaagtg ttaagcatcg 5460
ccagacataa aaatctattc tctcctcticg attgtagcat agcctgacag ctctagatac 5520
agcatttcta tgatgaaaaa tgagtatcca tcaggeiaatc tagaagacta gccgtgtttt 5580 
ctcagactcc acctttgttt gcactctgtt gcctgtgagg agctttctgg catgtgatta 5640 
tttacttcaa aactagagtt ccaagcacct acattaatta ttttatattg tgtgcagaat 5700 
agtatatctt ttaatgtcag atatgataca ctgcacatat tgcttttgca ctcttaaaat 5760 
ttttgtacta aataatagaa aatatttata ttctttgagt gtgagctttg aatagatggc 5820
attatcactt tattgttttt ttaacaaaaa ctttttctca attattctat tgcaatgtta 5880
ttctgagcaa gtcctatgcc aaatatcttg tataatgttt gtatggaaga ttaaatttta 5940
ctcttgtgtg gtaagactat ttcagttact gattttatag ttggaatttg atattccagc 6000
acaaagtcca cagtgtattc agaaatccaa gttggtgtca tacatttcat tttgatgtga 6060
acttttcttt gctttccttt gttctaagac tccattttgc aataaacgtt ttgacagcaa 6120
<210> 30
<211> 1914
<212> DNA

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7526442CB1 

<400> 30 

gctgtcatcg ttccgtgggc cctgctgcgg gcacgctctc ggcgcatgcg ttttttatgc 60 
gggattaagc ttgctgctgc gtgacagcgg agggctagga aaaggcgcag tggggcccgg 120 
agctgtcacc cctgactcga cgcagcttcc gttctcctgg tgacgtcgcc tacaggaacc 180 
gccccagtgg tcagctgccg cgctgttgct aggcaacagc gtgcgagctc agatcagcgt 240 
ggggtggagg agaagtggag tttggaagtt caggggcaca ggggcacagg cccacgactg 300 
cagcgggatg gaccagtact gcatcctggg ccgcatcggg gagggcgccc acggcatcgt 360 
cttcaaggcc aagcacgtgg agactggcga gatagttgcc ctcaagaagg tggccctaag 420 
gcggttggaa gacggcttcc ctaaccaggc cctgcgggag attaaggctc tgcaggagat 480 
ggaggacaat cagtatgtgg tacaactgaa ggctgtgttc ccacacggtg gaggctttgt 540 
gctggccfctt gagttcatgc tgtcggatct ggccgaggtg gtgcgccatg cccagaggcc 600 
actagcccag gcacaggtca agagctacct gcagatgctg ctcaagggtg tcgccttctg 660 
ccatgccaac aacattgtac atcgggacct gcccccaagg cccatccagg gcccccccac 720 
atccatgact tccacgtgga ccggcctctt gaggagtcgc tgttgaaccc agagctgatt 780 
cggcccttca tcctggaggg gtgagaagtt ggccctggtc ccgtctgcct gctcctcagg 840 
accactcagt ccacctgttc ctctgccacc tgcctggctt caccctccaa ggcctcccca 900 
tggccacagt gggcccacac cacaccctgc cccttagccc ttgcgagggt tggtctcgag 960 
gcagaggtca tgttcccagc caagagtatg agaacatcca gtcgagcaga ggagattcat 1020 
ggcctgtgct cggtgagcct taccttctgt gtgctactga cgtacccatc aggacagtga 1080 
gctctgctgc cagtcaaggc ctgcatatgc agaatgacga tgcctgcctt ggtgctgctt 1140 
ccccgagtgc tgcctcctgg tcaaggagaa gtgcagagag taaggtgtcc ttatgttgga 1200 
aactcaagtg gaaggaagat ttggtttggt tttattctca gagccattaa acactagttc 1260 
agtatgtgag atatagattc taaaaacctc aggtggctct gccttatgtc tgttcctcct 1320 
tcatttctct caagggaaat ggctaaggtg gcattgtctc atggctctcg tttttggggt 1380 
catggggagg gtagcaccag gcatagccac ttttgccctg agggactcct gtgtacttca 1440 
catcactgag cactcattta gaagtgaggg agacagaagt ctaggcccag ggatggctcc 1500 
agttggggat ccagcaggag accctctgca catgaggctg gtttaccaac atctactccc 1560 
tcaggatgag cgtgagccag aagcagctgt gtatttaagg aaacaagcgt tcctggaatt 1620 
aatttataaa tttaataaat cccaatataa tcccagctag tgctttttcc ttattataat 1680 
ttgataaggt gattataaaa gatacatgga aggaagtgga accagatgca gaagaggaaa 1740 
tgatggaagg acttatggta tcagatacca atatttaaaa gtttgtataa taataaagag 1800 
tatgattgtg gttcaaggat aaaaacagac tagagaaact tattcttagc catcctttat 1860 
ttttatttta tttatttttt gatggagtct tgcactccag cctggtgaca gact 1914 
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oenne/ 1 nreomne proiem Kinases, cauuy w; uunuun. 1 1 ^ i 
KINASE PROTEIN DOMAIN TRANSFERASE PD00584: L27-G36 


PROTEIN KINASE DOMAIN 
DM00004|A53714|17-262: L27-V151
DM00004|I49376|270-509: K26-G153
DM00004|P08458|20-262: I30-V151
DM00004|P38692|24-266: K26-V151


Potential Phosphorylation Sites: S34, S75, S106, S137, T25, T46


Potential Glycosylation Sites: N44 


Protein kinases ATP-binding region signature: D0-K53


Protein kinase domain: F46-G305 

Serine/Threonine protein kinases, catalytic domain: F46-G305 


Eukaryotic protein kinase IPB000719: H189-L204
Protein kinases signatures and profile: T173-P230 


CASEIN KINASE I GAMMA ISOFORM CKIGAMMA TRANSFERASE
PROTEIN KINASE DOMAIN
DM00004|B56711|48-303: V48-L76 E109-R302
DM00004|A56711|46-303: V48-L76 E109-R302
DM00004|C56711|45-301: V48-L76 E109-R302
DM00004|D56406|31-276: V48-L76 E109-R302


Potential Phosphorylation Sites: S19, S99, S129, S262, T84, T183, T210, T232, T247


Protein kinases ATP-binding region signature: I52-K75


Serine/Threonine protein kinases active-site signature: L193-V205 


Signal cleavage: M1-G68 


Protein kinase domain: V46-F310 


Serine/Threonine protein kinases, catalytic domain: V46-K313


Eukaryotic protein kinase IPB000719: C168-L183, I239-G249
Hexokinase family IPB001312: S10-G24


Potential Phosphorylation Sites: S5, S56, S67, T52


Hexokinase family IPB001312: S10-G24 



PROTEIN KINASE DOMAIN 
DM00004|D8044|100-349: V38-A117
DM00004|P08630|329-573: E35-N114
DM00004|Q08881|361-604: E35-L112


Potential Phosphorylation Sites: S14, S67, S69 


Leucine zipper pattern: L112-L133


Protein kinases ATP-binding region signature: V42-K63


Regulator of G protein signaling domain: T54-C175


Regulator of G protein signalling domain: T54-C175 


GPCR kinase signature PR00717: F171-N183


Regulator of G protein signalling domain proteins PF00615: MIS-K21, F162-K178


RECEPTOR KINASE TRANSFERASE SERINE/THREONINE-PROTEIN ATPBINDING
BETAADRENERGIC COUPLED PROTEIN MULTIGENE FAMILY PD007430: M1-V53


KINASE; THREONINE; ATP; SERINE; 
DM01747|P21146|152-191: E152-S187


N-TERMINAL DOMAIN 
DM05135|P21146|33-150: U3-E151
DM05135|P32865|33-150: L34-E151
DM05135|O09639|34-149: L34-I150


Potential Phosphorylation Sites: S29, S38, S60, S127, S168, T97


Cell attachment sequence: R158-D160 


CELL CYCLE PROGRESSION PROTEIN FAST KINASE PD041692: L200-P417


FAST KINASE PD135789: M1-R201


Potential Phosphorylation Sites: S94, S246, S332, S373, S441, T138, T336, T365


Signal Peptide: M1-G18, M1-A21
Polypeptide

7526214CD1

7526228CD1

7526246CD1

7526258CD1
PF-1724P 



Table 4 



Polynucleotide 
SEQ ID NO:/ 
Incyte ID/ 
Sequence Length 


Selected 
Fragments 


Sequence Fragments


5' Position


3' Position


10/ 73251 oDCaI/ 
4430 


2471-4430, 1476-1534, 1-754, 1679-1715


FL 1002225_3 o2 


1 


4430 






95083778J1


313 


1222 






GBI.NT_009952_014.10.edit2


358 


592 






GBI.NT_009952_014.10.edit1


649 


4430 






72678960V1


938 


1574 






72678288V1


1084 


1754 






g9777972 


1095 


1765 






72682030V1 


1185 


1978 






g14503665


1198 


1884 






g11642692


1204 


1880 






73197364D1 


1208 


1859 






g14810994


1214 


2029 






72680814V1


1223 


2017 






73197252V1 


1255 


1917 






g24471308 


1255 


2004 






73199082D1 


1259 


1957






73197393D1 


1282 


2255 






73196694D1


1298 


1950 






g12769183


1321 


2017 






g23286620 


1335 


2004 






g23286086


1352 


2002 






g29389943 


1354 


2004 






g21980207 


1383 


2002 






g12763752


1396 


2064 






g13531552


1469 


2171 






g31267289


1503 


2297 






g13341861


1506 


2183 






g11643902


1515 


2243 






g31271373


1528 


2457 






g31069857


1546 


2439 






g30307375 


1560 


2428 






g30307376 


1560 


2458 






g13534533


1582 


2282 






8568096T1 (KIDNFEC01)


1582


2429






g19119842


1587 


2305 






g16200364


1599 


2338 






gl4ol425o 


1 /me 


23S0 






g30463287 


1625 


2433 






8567187T1 (KIDNFEC01)


1638 


2413 






g30758954 


1640 


2392 






g31295054 


1666 


2445 






71382643V1 


1693 


2395 






g24809844 


1716 


2469 






g24806937 


1718 


2467 






g29391117


1720 


2469 






8557426T1 (LUNGNOT30)


1724 


2451 






8628262H1 (UTREDMF02) 


1724 


2463 






8628262J1 (UTRBDMF02) 


1724 


2467 






g12760412


1772 


2468 






g10745442


1775 


2468 






g14294389


1782 


2468 






g24794954 


1782 


2469






7623349H1 (HEARFEE03) 


1783 


2431 






g23288713


1786 


2468 






g23295470 


1801 


2467 






8215062H1 (FIBRTXC01)


1809 


2468 






g23293825 


1816 


2469 






g24810755


1829 


2469 






g21981399 


1833


2469 






g31148999


1852 


2469 






g19751033


1858 


2469 






g22697148 


2196 


2847 






g12877899


2386 


3249 






g11261005


3402 


4093 






4289337F6 (BRABDIR01)


3471 


4274 






g24795218 


3698 


4426 


17/7526192CB1/
3276 


1999-3276, 910-1003, 1-224, 1546-1612


GBLNTjOl 1255_00L 13.editl 


1 


3276 






g14077475


734 


1407 






6981630H1 (BRAIFER05) 


1223 


1711 






6306286F7 (NERDTDN03) 


1860 


2512 






6306286F8 (NERDTDN03) 


1860 


2524 






6306286T6 (NERDTDN03) 


1925 


2468 






55139024H1


2447 


3057 






55139140J1 


2552 


3009 


18/7526193CB1/
3910 


3709-3732, 2091-3219, 1-823, 3788-3910


7217965H1 (COLNTMCX)!) 


1 


344 






7266654H2 G^OSEDICOl) 


20 


367 






OBL938794.82 


20 


3910






gl9373Q27 


99 


612 






g6992730 


340 


775 






3ol7328Fo (EPIPNOTOl) 


464 


775 






55139719H1 


486 


797 






1328791H1 (PANCNOT07)


500 


746 






g9707306 


520 


779 






g4740126 


520 


779 






g9705742 


520 


779 






g4987918 


520 


779 






g5637149 


520 


779 






3023352H1 (PROSDIN01)


537 


799 






1679654H1 (STOMFET01)


552 


768 






g5849814


560 


779 






6844123H1 (KIDNTMN03) 


567 


790






55099275J1


582 


1094 






8018416F6 (BMARTXE01)


587 


1213 






3574386H1 (BRONNOT01)


610 


895 






72717764V1 


616 


1173 






72719264V1 


634 


1307 






55099283J1


698 


1094 






g10091676


706 


1162 






55139943H1 


776 


1097 






55139835J1 


778 


1097 






55139827J1


797 


1101 






55139819J1 


799 


1093 






g13410426


805 


1348 






g9331447 


824 


1456 






g10734144


824 


1504 






g9336431 


825 


1466 






g10143324


830 


1503 






g7930221 


834 


1412 






gli316872 


840 


1124 






g7254240 


840 


1426 






55139803J1 


849 


1097 






55 13981111 


868 


1096 






g6868467 


885 


1348 






g20968062 


1048 


1744 






g20856405 


1115 


1714 






g14254976


1124 


1734 






g14254972


1127 


1742 






g20967828 


1146 


1711 






g8039829 


1217 


1481 






g8039887 


1222 


1524 






g14404428


1246 


1525 






7275351H2 (LIVRUNE01)


1249 


1757 






g14404429


1251 


1509 






g21012570 


1295


1751






g14451882


1321 


1832 






g14453118


1334 


1889 






/loot ni'iiJi /"DTJ A"Dr^rDAi\ 


1376 


1647 






72337884V1


1376 


1670 






72337654V1 


1376 


1768 






72336903V1 


1376 


1768 






72338409V1 


1376 


1768 






72337087V1 


1376 


1768 






72338101V1 


1376 


1768 






72337058V1


1376 


1768 






72338322V1 


1376 


1768 






72338656V1 


1376 


1768 






72337641V1


1376 


1768 






72338238V1 


1376 


1768 






72337228V1


1376 


1768 






72338353V1


1376 


1768 






72766921V1 


1376 


1768 






72338557V1 


1376 


1768 






72338434V1 


1376 


1768 






72336974V1 


1376 


1768 






72338470V1 


1376 


1768 






72338136V1 


1376 


1768 






72338013V1 


1376 


1768 






4291033F6 (BRABDIR01)


1376 


1768 






72338336V1 


1376 


1768 






72337535V1 


1376 


1768 






72338790V1 


1376 


1768 






72338126V1 


1376 


1768 






72338444V1 


1376 


1768 






72338450V1 


1376 


1768 






72337857V1 


1376 


1768 






72337183V1 


1387 


1768 






g21012239 


1466 


1742 






g12199215


1470 


1829 






g14404390


1491 


2092 






g28118293


1497 


1880 






7993480H1 (UTRSDIC01)


1502 


1820 






8018416R6 (BMARTXE01)


1520 


2235 






7067749H1 (BRATNOR01)


1599 


2231 






6345837H1 (LUNGDIS03) 


1961 


1989 






6345837H1 (LUNGDIS03) 


1992 


2298 






6038785H1 (PITUNOT06) 


2014 


2654 






g2354017 


2076 


2336 






836023611 (MIXDUNN06) 


2092 


2727 






g12371898


2135 


2425 






g12361664


2135 


2448 






5781301F6 (BRAXNOT03) 


2147


2580






g12361674


2153 


2443 






56057236H1 


2226 


2705 






g12370692


2476 


2624 






g10460662


2508 


2889 






5497716F6 (BRABDIR01)


2511 


2969 






g12233185


2530 


2930 






g14345747


2807 


3117 






6327887H1 (BRANDIN01)


2862 


3393 






5497716R6 (BRABDIR01)


2870 


3085 






6245571H1 (TESTNOT17)


3117 


3543 






5772648H1 (BRAINOT20)


3343 


3863 






S2237352 




)/oy 






SI790314H1 (FIBRTXSOT) 


>569 


JoU/ 






5786172H1 (FIBRTXS07) 


3569 


JoU/ 






5786527H1 (FIBRTXS07)


3569 




19/7526196CB1/
4380 


464^63,2075- < 
2164, 1-349, 
3320-4380 


GBI.g29789976.edit1


1


1380 






B9772401 


41 


583 






9505159U1 


763 


lo41 






9524857U1 


818 


1676 






9649412U2 


884 


1835 






9505172U1 


887 


1743






9524956U1


887 


1834 






9649412U1


887 


1835 






9611509U1 


897


1740 






9611509U3 


900 


1709 






9509648U1 


954 


1835 






7754868J1 (SPLNTUEOl) 


1032 


1590 






9580055U3 


1035 


1895 






9600429U1 


1036 


1729 






9600055U1 


1036 


1835 






9600429U3 


1039 


1835 






9580055U1


1065 


1676 






55095641J1 


1141 


1738 






72484222D1 


1169 


1738 






72481795D1 


1176 


1738 






72616020V1 


1188 


1738 






72484068D1


1189 


1737 






8516684H1 (HNT2TXF01)


1198 


1920 






8757725H1 (TLYJTXN01)


1198 


2052 






72481336D1 


1202 


1738 






6831090J1 (SINTNOR01)


1304 


1944 






7719693H1 {SlNTFHHa2) 


1353 


2006 






6819080J1 (BRAUNOR01)


1354 


1879 






8016820J1 (BMARTXE01)


1354 


1985 






8021257J1 (BMARTXE01)


1354


2016






8016427J1 (BMARTXE01)


1354 


2051 






7050749F6 (BRACNOK02) 


1354 


2086 






7oUUol!^JI vcaUOIMnUI^ 


i 


2011 






55144872J1 


1411 


2084 






55144879J1 


1411 


2094 






g11295923


1498 


2168 






9021791511 


2279 


2898 






90218047J1 


2279 


2901 






90217923J1 


2279 


2918 






90218039J1 


2279 


2945 






90217907J1


2279 


2987 






90217931J1


2279 


3041 






90217947J1


2279 


3091 






90218031J1 


2279 


3111 






90218023J1


2279 


3151 






9679740U1 


2285 


3136 






95103533H1 


2361 


2898 






7208870H1 (FIBPFEA01)


2365 


3037 






7050749R6 (BRACNOK02) 


2368 


3037 






822846R1 (KERANOT02) 


2392 


2942 






7253472H1 (BRAMNOA01)


2398 


3036 






7703764H1 (UTRETUE01)


2406 


2999 






g12097556


2422 


3076 






7050236H1 (BRACNOK02) 


2429 


3046 






7115279H1 (BRAENOK01)


2430 


3037 






g19212752


2447 


3014 






7685360H1 (BRABDIK02)


2500 


3037 






6946127F6 (FTUBTUR01)


2538 


3118 






90218039J1


3013 


3077 






g30442850 


3108 


4157 


20/7526198CB1/
4293 


1-48, 3480-3610, 1319-1887


73232879V1


1 


622 






73232879D1 


1 


624 






GBI.g29789976.edit1


1 


4163 






g14002261


3 


677 






g10937540


49 


748 






g13997818


76 


607 






8684117H1 (BRAIUNF01)


83 


981 






g22275488 


86 


731 






g22660086 


86 


826 






8042207H1 (OVARTUE01)


101 


650 






7751875H1 (HEAONOE01)


102 


660 






7441147H1 (ADRETUE02) 


102 


693 






g30216686


138 


660 






9713909U2 


138 


847 






90049479J1


139


730






90049355F6 


139 


807 






90049387J1 


139 


833 








139 


847 






90049363J1 


139 


848 






90049379H1 


139 


914 






90049495J1 


139 


926 






90049355H1 


139 


1002 






60215662U1 


194 


827 






g21012603


299 


882 






g21012604


318 


882 






9679740U1 


2082 


2933 









7n09 








>UZloU47Jl 


'>AOO ' 






















77A7 


\ 














9UZ17931JX 




Zo^O 


i 




90217947J1 


ZU92 


Zooo 






OAI 1 Q A'3 1 T 1 

9U21oU31Jl 










90218023J1


OAAO 

2U92 


zy4o 






95103533H1 


O 1 CO 


zoyj 






7208870H1 (FIBPFEA01)


ZlOZ 


Zo34 






7050749R6 (BRACNOK02)


01 AC 

210D 


Zo^4 






822846R1 (KERANOT02)


21o9 


Z/^7 






7253472H1 (BRAMNOA01)


'Si AC 


Zo33 






7703764H1 (UTRETUE01)


22U3 


Z/VO 






g12097556


OO 1 A 

2219 


OCT* 






7U3u23oxll (i5KAC^IJJvu2j 




















g19212752


ZZ44 


Zoi i 






7oo53DUHi (jdKABIJIKuz) 


zzy / 


ZCj** 






iCA/l/C10TC^ mTTT TO n TO A 1 \ 


OIK 


Z7XD 






90218039J1


OQ1 A 

ZoiU 


Zo /*f 






g30442850 


2905 


aAA>i 
39o4 






GBI.g29789976.edit2


9AAO 

3908 


4Z7.7 


21/7526208CB1/ 


2067-2098, 


2944771F7 (BRAITUT23)


1 


CQ1 

3ol 


6538 


4319-5404, 










4173-4201, 1- 










1485, 6070- 










6538, 2539- 










3468 












GBI J4TJDlo354JD04,13.eaul 


ZI 


OD3o 






oU18737j1 (BMAKlAcOl) 


43o 








g3422499 


CQ 1 


inc7 






g33iOoUo 


CDQ 


inc? 






QIAOQ/^/ltTI /"CD ATMOX>A'2\ 

ol9oo04Jrll vi>KAilNUKU3) 


yuo 


l*»JO 






O 1 AOOiCil T 1 /Tin A TXT^D A^\ 

o19oo04j1 (BKAIInUKUJ} 










758030DH1 (BKALrcCUl) 


I4o3 


ZU43 






8159951J1 (MlXDlMcUz) 


lyfQA 
i4oO 


Zi^D 








1645 


2142 






90041465H1 


2127 


2864 






90041465F6 


2129 


2864 






g5926184


2304 


2751 






8267027H1 (MIXDUNF02)


2382 


2971 






8498563H1 (BRSMTXF01)


2543 


3089 






g2162976


2564 


3118 






8123676H1 (HEAONOC01)


2568 


3068 






71891427V1 


3104 


3725 






71893456V1 


3255 


3883 






2228879F6 (PROSNOT16)


3376 


3897 






71892008V1 


3450 


4136 






g11290981


3459 


4057 






1236920F1 (LUNGFET03) 


3476 


3968 






6517912F8 (BRAFTDT02) 


3478 


4027 






g11976492


3479 


4046 






g24786219 


3481 


4172 






56010211H1


3489 


4296 






g10823987


3502 


4173 






g24787165 


3522 


4172 






g30853338


3533 


4165 






g18521727


3555 


4177 






g13285388


3564 


4310 






g6890090 


3578 


4049 






3967421F7 (PROSTUT10)


3643 


4157 






3967421F6 (PROSTUT10)


3643 


4275 






g2836991 


3652 


4172 






g28094190 


3663 


4244 






1803939F6 (SINTNOT13) 


3676 


4334 






g22767266 


3699 


4147 






g10825536


3806 


4475 






4827574F6 (BLADDIT01)


3959 


4517 






4827574T8 (BLADDIT01)


3972 


4672 






7682581T8 (BRABDIK02)


4006 


4691 






g24779423 


4007 


4745 






g24794912 


4028 


4745 






g12102435


4040 


4746 






g9876920 


4043 


4695 






g19734318


4067 


4745 






g23285997 


4076 


4745 






g27792216 


4078 


4745 






g11364160


4104 


4659 






g24775063 


4116 


4745 






g19729780


4155 


4745 






g19755171


4165


4745






g21175144 


4184 


4745 






1803939T6 (SINTNOT13) 


4194 


4686 






1625628T6 (COLNPOT01)


4195 


4685 






2228879T6 (PROSNOT16) 


4215 


4689 






4827574T6 (BLADDIT01)


4216 


4688 






g23286214 


4242 


4745 






2851843T6 (BRSTTUT13) 


4251 


4702 






2153236F6 (BRAINOT09)


4284 


4745 






7621418J1 (HEARFEE03) 


4873 


5519 






g12413646


4878 


5586 






g10319848


5089 


5766 






6610468H2 (KIDNTMC01)


5144 


5808 






414471F1 (BRSTNOT01)


5483 


6067 






6480671H1 (PROSTMC01)


5491 


6053 






7326473R8 (THYMNOE02) 


5502 


6329 






g10318203


5571 


6067 






7621418H1 (HEARFEE03) 


5580 


6040 






g3099121 


5586 


6065 






7012818F7 (KIDNNOC01)


5587 


6067 






7012918F8 (KIDNNOC01)


5587 


6069 






g10035211


5603 


6069 






g4188696 


5614 


6067 


22/7526212CB1/ 
2349 


1442-1485, 2116-2349, 1-1049


2944771F7 (BRAITUT23)


1 


581 






GBI.NT_016354_004.13.edit1


1 


2290 






8018737J1 (BMARTXE01)


438 


1053 






g3422499


581 


1057 






g3330808 


589 


1057 






8198864H1 (BRAINOR03) 


906 


1436 






8198864J1 (BRAINOR03) 


1332 


1978 






7580306H1 (BRAIFEC01)


1483 


2045 






73073134V1 


1608 


2349 


23/7526213CB1/ 
8015


6365-6398, 7104-7128, 4028-5749, 1-2254, 2817-3243


GBI.NT_016354_003.15.edit1


1 


8009 






90214127H1


203 


877 






9815193U2 


203 


877 






9775316U2 


203 


936 






9822048U1 


571 


1314 






9770976U2 


571 


1436 






9785981U1


571 


1450 






9770976U1 


577 


1378 






9785972U1 


586 


1449 






9822048U2 


594


1332






9746466U2 


652 


1450 






9773732U2 


731 


1652 






9784110U2


747 


1525 






9784110U1


756 


1532 






9770966U1 


799 


1670 






9796042U2 


806 


1686 






9746439U1 


833 


1552 






9746180U2 


833 


1695 






9746439U2 


833 


1721 






9738822U2 


833 


1768 






9770980U1 


843 


1675 






9770980U2 


849 


1612 






9746180U1


850 


1644 






9770984U2


850 


1675 






9770970U1


855 


1615 






9770970U2 


856 


1771 






9770962U2 


857 


1624 






9746239U2 


857 


1678 






9746359U1 


915 


1764 






9811817U2 


928 


1675 






9822051U2 


928 


1768 






9822051U1 


928 


1899 






9770964U1 


933 


1753 






9770964U2 


933 


1779 






9773790U2 


934 


1774 






9785982U2


943 


1759 






9746294U1 


947 


1898 






9746294U2 


950 


1844 






9822053U1 


1827 


2719 






9785984U2 


1829 


2642 






9785975U2 


1829 


2704 






9785975U1 


1829 


2737 






9785984U1 


1830 


2720 






9746440U2 


2108 


3060 






9746414U1 


2154 


3036 






9770983U2 


2155 


3043 






9786418U1 


2160 


3084 






9746414U2 


2161 


3052 






9770973U2 


2172 


3054 






9822054U1 


2174 


3015 






9822054U2


2180 


2972 






9785985U1 


2212 


3066 






9785976U1 


2212 


3163 






9785976U2 


2215 


2958 






9785985U2 


2215 


3030 






9770975U1 


2277 


3051 






9770979U1


2281


3068






9785986U2 


2422 


3338 






9822055U1


2430 


3130 






9785986U1 


2430 


3212 






9785977U2 


2430 


3244 






9822055U2 


2430 


3322 






9784111U2


2434 


3184 






g10934151


5975 


6837 






g15763429


6557 


7314 






g18513541


7215 


7903 






g21477809


7241 


8015 



24/7526214CB1/ 
7945 


7039-7063, 1-2933, 3938-5684, 6300-6373


GBI.NT_016354_003.15.edit1


1 


7945 






9775320U2 


203 


668 






9775320U1 


203 


955 






9785981U2 


492 


1352 






9785990U2


507 


1304 






9770976U2 


507 


1371 






9785981U1 


507 


1385 






9770976U1 


513 


1313 






9785972U2 


514 


1304 






9785972U1 


522 


1384 






9746466U2 


588 


1385 






9770978U2 


692 


1396 






9775320U2 


707 


787 






9770974U1 


735 


1548 






9796042U2 


742 


1621 






9746439U1 


769 


1487 






9746439U2 


769 


1656 






9738822U2 


769 


1703 






9746215U2


785 


1551 






9770984U2 


786 


1610 






9770970U1 


787 


1550 






9770962U2 


787 


1559 






9770970U2 


788 


1706 






9770968U2 


790 


1712 






9746215U1 


791 


1488 






9746239U2 


792 


1613 






9770968U1 


796 


1676 






9811817U2 


863 


1610 






9822051U2


863 


1703 






9822051U1


863 


1834 






9770964U1 


868 


1688 






9770964U2 


868 


1714 






9773790U2 


869


1709






9785982U2 


878 


1694 






9746294U1 


882


1833 






9746294U2 


885 


1779 






9822053U1 


1762 


2657 






9785984U2 


1764 


2579 






9785975U2 


1764 


2642 






9785975U1 


1764 


2675 






9785984U1 


1765 


2658 






9746295U1 


2418 


3318 






9811820U2 


2583 


3356 






95037369J1 


2664 


3334 






95048903H1 


2664 


3380 






95049151H1


2664 


3382 






95048851H1


2664 


3382 






95037269J1 


2664 


3382 






95049119J1


2664 


3389 






95049103J1 


2664 


3405 






95049127J1 


2664 


3422 






95048935H1


2664 


3431 






95049011H1


2664 


3445 






95049135J1 


2664 


3454 






95048927H1 


2664 


3458 






95048943H1 


2664 


3458 






95037385H1 


2664 


3459 






95037377J1 


2664 


3464 






95049183H1


2664 


3469 






95037277J1 


2664 


3469 






95049191H1 


2664 


3470 






95037393H1


2664 


3471 






95048891H1


2664 


3480 






95048975H1 


2664 


3510 






95048983H1 


2664 


3511 






95037293H1 


2664 


3532 






95048911H1


2664 


3562 






95048919J1 


2668 


3458 






95049075H1 


2668 


3609 






95049011J1


2668 


3617 






95048991H1 


2706 


3617 






95048935J1 


2711 


3617 






95037293J1 


2776 


3617 






95037369H1 


2779 


3617 






95037385J1 


2779 


3617 






95048951J1


2779 


3617 






9743770U1 


2781 


3654 






95048903J1 


2782 


3617 






90214227J1 


2784 


3654 






95037377H1 


2804


3617






9775731U2


2810 


3654 






90214259J1 


2823 


3653 






9775730U2 


2848 


3654 






9785996U2 


2849 


3628 






9785996U1 


2849 


3651 






9785978U1 


2849 


3654 






9775731U1 


2854 


3654 






90214227R6 


2876 


3653 






9775722U2 


2876 


3654 






9775718U2 


2876 


3654 






9775723U2 


2899 


3650 






9775725U2 


2906 


3654 






9775719U1 


2915 


3654 






9791941U1 


2949 


3654 






9775718U1 


2952 


3654 






9775720U1 


2971 


3654 






g24792102


3008 


3682 






g14047307


3181 


3922 






8138972T1 (SPLNNOT10)


3471 


4320 






g19732719


3797 


4551 






8760604H1 (MYEPUNN01)


4066 


4870 






8720327H1 (TLYJUNF01)


4320 


5134 






8717314H1 (TLYJTXF03) 


4541 


5355 






8502029H1 (KIDEUNF01)


4764 


5490 






g21170624


5247 


5934 






g10934151


5910 


6772 






g15763429


6492 


7250 






g10153702


6854 


7555 






g18513541


7150 


7839 






g21477809 


7176 


7945 


25/7526228CB1/ 
3149 


1298-1355, 
2272-3149 


g14083204


1 


528 






8507486H1 (SMCCTXF01)


1 


719 






GBI.NT_007299_017.12.edit1


30 


3149 






7953010H1 (SYNONOC01)


217 


706 






73414963V1 


229 


740 






g15748947


239 


1013 






8711164H1 (MYEPUNF01)


243 


910 






g15759491


263 


996 






95104290J1 


402 


1249 






95104234H1 


1515 


2354 






71634372V1 


1538 


2050 






95104302H1 


1569 


2354 






8208729H1 (LIVRTXS02)


1889 


2602 






g13521551


2023 


2773 






71769306V1 


2112


2648






71638426V1 


2124 


2819 






71635816V1 


2147 


2671 






g12365044


2211 


2688 






g11014334


2566 


3119 


26/7526246CB1/ 
3617 


1-1020, 2999-3617


95079085H1


1 


563 






GBL1859885.71 


118 


3617 






6828803J1 (SINTNOR01)


407 


1087 






g14290834


412 


1145 






7401136H1 (SINIDME01)


506 


1090 






7741107H1 (THYMNOE01)


521 


563 






g15763036


703 


1270 






95079085H1 


869 


1158 






7741107H1 (THYMNOE01)


869 


1501 






g11937225


1194 


1870 






g13915792


1194 


1985 






g13916980


1194 


1992 






9743772U2 


1358 


2314 






8183605H1 (EYEElNONOi) 


1371 


2039 






95078961H1 


1373 


2314 






95079093H1 


1376 


2303






95078713J1


1385 


2314 






g19891326


1421 


2000 






g19895139


1423 


2016 






9817121U4


1423 


2314 






g19370851


1429 


2101 






56082957H1 


1439 


2254 






9817121U3 


1455 


2314 






g14054691


1457 


2209 






95078937J1 


1466 


2314 






95078777H1


1471 


2313 






95078913J1 


1471 


2314 






95078729J1 


1478 


2314 






95078985H1 


1480 


2314 






95043360J1 


1480 


2314 






95078993H1 


1492 


2313 






95079077H1 


1494 


2313 






95043452J1 


1494 


2314 






90155990J1 


1495 


2351 






95078837J1 


1497 


2314 






9743371U1


1497 


2314 






90155858J1 


1497 


2351 






g12766544


1509 


2150 






90155874J1 


1509 


2351 






95079005F6 


1515 


2314 






95078793H1


1519 


2303






95079005H1 


1520 


2314 






g11641958


1521 


2249 






9o04343oJl 


1529 


2314 






9805472U2 


1529 


2314 






g8658271 


1539 


2152 






90151356J1 


1542 


2130 






90161465H1 


1542 


2336 






90161565H1 


1542 


2367 






90161481H1 


1542 


2412 






9805472U1 


1544 


2314 






g12612323


1545 


2247 






7721181J1 (THYRDIE01)


1557


2187




95078893H1


1568


2313






95078861H1


1568


2314




95078885H1


1568


2314






72677129V1


1592


2177






95079093F6


1594


2303






9507901311 


1598 


2314






g12684079


1599


2282






90155866J1 


1599 


2351 






g18515296


1601


2189






9743772U1 


1613 


2314






90154817H1 


1616 


2267 






95078705H1 


1616 


2314 






95078829J1 


1616 


2314






622371114 


1625 


2252 






g1120536


1631 


2190 






95078853J1 


1638


2313 






g16175992


1648 


2404 






7726720J1 (UTRCDIE01)


1653


2261






7690765J1 (PROSTME06)


1656 


2242 






g19367868


1661 


2344 






72681393V1 


1677 


2397






g15348804


1679 


2352 






993893R6 (COLNNOT11)


1688 


2368 






g10454044


1714 


2320 






g9720208 


1715 


2528 






7621966J1 (HEARFEE03) 


1728 


2280






6054035F6 (BRAENOT04)


1737


2340






g30775472 


1740 


2392 






95079029J1 


1744 


2314 






g10320808


1773


2465






90155982H1 


1782 


2351






g14814460


1797 


2414 






7723759J1 (THYRDIE01)


1824 


2428 






8752975H1 (TLYJTXN02) 


2599 


3242 


27/7526258CB1/
1955 


1-41, 1882-
1955, 1529-
1598


GBI.NT_007914_013.10.edit1




1955






g13910802


81


946






73381212D1 


715 


1172 






g19374315


812 


1528 






7212021H2 (BLYRTXT03)


987 


1540 


28/7526311CB1/
2937


1394-1429, 1-
92, 2612-2937


GBW28308.PT127.1 


1 


2937 






95117349H1 


122 


658 






95117218H1 


122 


690 









95117318H1


122 


739 






GBL95I17318CL1 


123 


1393 






95117318J1


539 


1393 






95117349J1 


621 


1393 






95117218J1 


807 


1393 






g10357922


1099 


1644 






6701408H1 (DRGCNOT02) 


1237 


1848 






2745158H1 (LUNGTUT11)


1251 


1561 






8374138J1 (MIXDUNN16) 


1390 


2042 






g10208513


1392 


2069 






g11112201


1396 


2052 






g31804533


1430 


1987 






7651467F6 (STOMTDE01)


1495 


1949 






7651467H1 (STOMTDE01)


1502 


1940 






g12336360


1531 


2266 






55094066H1


1542 


2039 






55094066J1 


1542 


2039 






2927987F6 (TLYMNOT04) 


1549 


2106 






g24902268 


1579 


1965 






g31010996 


1658 


2132 






g9772813 


1731 


2380 






g12427833


1737 


2490 






g13581931


1759 


2540 






6280867T8 (SKINDIA01)


1837 


2534 






g23285666 


1853 


2606 






g8909555 


1871 


2301 






1649261F6 (PROSTUT09) 


1880 


2223 






2596906F6 (OVARTUT02) 


1925 


2496 






268900T6 (HNT2NOT01) 


1960 


2513 






2921276F6 (SININOT04)


1971 


2483 






g1476946


1984 


2421 






g1479766


1984 


2491 






55005237J1 (PHDEDNV02) 


1984 


2567 






g1484668


1984 


2611 






g1927463


1992 


2496 






907626R2 (COLNNOT09)


2000 


2404






g11008053


2015 


2565 






2921293T6 (SININOT04)


2021 


2502 






g1505878


2033 


2567 






770052R1 (COLNCRT01)


2038 


2544 






g434511 


2060 


2394 






8450642J1 (MIXDTUN01)


2062 


2561 






g11014883


2066 


2606 






1649261T6 (PROSTUT09) 


2083 


2510 






g10821187


2107 


2572 






839823QT1 (SPLNNOT04) 


2109 


2500 






7628312H1 (GBLADEE01)


2129 


2573 






g2706188 


213d 


23 D3 






g3899217 


zi3y 


23 






g5742157 


2147 


23D7 






4163995T6 (BRSTNOT32) 


2154 


2347 






g7319568


2156 


ZoOO 






g4188364 


2160 


Zo09 






6562933H1 (MCLDTXT04) 


2169 


2741 






g11512426


2182 


2609 






g274460 


2223 


2552 






g2567C99 


2240 


2586 






g7319495 


2246 


2606 






g4620813 


2247 


2568






g13584179


2247 


2602 






2927987T6 (TLYMNOT04) 


2328 


2878 






8556290T2 (LUNGNOT30)


2355 


2860 






1299477T6 (BRSTNOT07)


2374 


2890 






g19758531


2472 


2913 






g10810599


2478 


2937 


29/7526315CB1/ 
6122 


1-88, 983-2443,
3020-5194


GBI-928294.PT122J0 


1 


6121 






9790480U1 


193 


871 






9709180U2 


193 


871 






9807280U2 


193 


871 






9807280U1 


193 


871 






9709180U1 


193 


871 






71866765V1 


310 


931 






72697902V1 


573 


982 






72343088V1 


867 


1481 






72343412V1 


867 


1492 






72343082V1


867 


1496 






72343409V1


867 


1573 






72343152V1


867 


1595 






72343264V1


890 


lo09 






72342802V1 


919 


io43 






72343571V1


922


< 'XD33— . — 






72006034V1 


1103 


1948 






8514925H1 (BRSTUNF01)


2143 


2950 
















6610909J1 (PLACFER06) 


2423 


3023 






6935902F8 (SINTTNIR02)


2471 


3073 






7314251H1 (UTREDME02) 


2669 


3237 






73396792V1 


2738 


3490 






73396412V1 


2738 


3554 






73396412D1 


2738 


3554 






8286949F6 (OVARDIN02) 


2800 


3574 






8286949T6 (OVARDIN02) 


3049 


3795 





625836H1 (K1DN1*1£H02) 3 


^247 3 


vif^j 1 






512613318 ^ 


kS79 t 


^357 1 






L501689F6 (SINTBSTOl) ^ 


^789 f 


^13 1 






{12095619 ' 


>055 t 


>Do3 1 






J243847J1 (BONEUNROl) t 


>121 t 


>809 1 






jl 1979244 i 


5167 f 


)854 1 






3016112T6(MUSCNOT07) i 


5317 ( 


>012 1 




\ 


5546927T1 (OVARTUTOl) ! 


5343 i 


S045 1 






R12763553 


5356 < 


S064 1 






B12758671 


5370 


5043






8736604J1 (BRAINON03)


5370


5116






7753003J1 (HEAONOEOl) 


5405


6106






58004288J1 


5410 


6122 






g23283197 


5424 


6117 






7752327H1 (HEAONOEOl) 


5458 


6108






58004372H1 


5461 


6122






896404T2 (BRSTNOT05) 


5479 


6079






824799829 


5493 


6117






1682961T7 (PROSNOT15) 


5495 


6063






g15996499


5524


6117


30/7526442CB1/ 
1914 


1826-1914 


g30290081 


1 


437






90004721J1


1


647






GBI.g22046009.edit1


1 


1893






GBIJFL931374.29 


1 


1914






90134244J1 


260 


862






90134244H1 


260 


863






70623209V1 


684 


1206






71565564V1 


684 


1379






70626918V1 


720 


1262






71563390V1 


741 


1452






71564044V1 


744 


1453






g1728583


748 


1386






71567352V1 


750 


1276






70623897V1


760


1283






70622825V1


762


1262






70645387V1


784


1394






70043337V1




1234






71566273V1 


789 


1378 






70494443V1


791 


1267 






71567483V1 


794 


1261 






90004721H1 


796 


1623 






7925523H2 (COLNTUS02)


807


1405 






70625410V1


814 


1325 






70625722V1 


818 


1406 






90004737H1 


821 


1623 
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1278 






71564547V1 


845 


1332 
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ID f i 
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1403 



